Abstract We consider an asexual biological population of constant size N evolving 
in discrete time under the influence of selection and mutation. Beneficial mutations 
appear at rate U and their selective effects s are drawn from a distribution g(s). After 
introducing the required models and concepts of mathematical population genetics, 
we review different approaches to computing the speed of logarithmic fitness increase 
as a function of N, U and g(s). We present an exact solution of the infinite population 
size limit and provide an estimate of the population size beyond which it is valid. We 
then discuss approximate approaches to the finite population problem, distinguishing 
between the case of a single selection coefficient, $g(s) = \delta(s - s_b)$, and a continuous 
distribution of selection coefficients. Analytic estimates for the speed are compared 
to numerical simulations up to population sizes of order $10^{300}$. 

Keywords evolutionary dynamics; Wright-Fisher model; clonal interference; 
traveling waves 



1 Introduction 

The foundations of mathematical population genetics were established around 1930 
in three seminal works of R. A. Fisher [18], J.B.S. Haldane [25] and S. Wright [61]. 
The achievement of these three pioneers is often referred to as the modern synthe- 
sis, because they resolved an apparent contradiction between Darwinian evolution- 
ary theory, with its emphasis on minute changes accumulating over long times, and 
the then recently rediscovered laws of Mendelian genetics, which showed that the 
hereditary material underlying these changes is intrinsically discrete. Like Ludwig 
Boltzmann faced with the problem of deriving the laws of continuum thermodynamics 
from atomistic models, Fisher, Haldane and Wright developed a statistical theory 
of evolution to explain how random mutational events occurring in single individuals 
result in deterministic adaptive changes on the level of populations. Not surprisingly, 
then, statistical physicists always have been, and are now increasingly attracted to the 
study of evolutionary phenomena in biology (see e.g. Ill UI37lfT7l [3l). 

In this article we focus on a specific, rather elementary question in the mathemat- 
ical theory of evolution, which was posed in the early days of the field and remains 
only partly understood even today: We ask how rapidly an asexually reproducing, 
large population adapts to a novel environment by generating and incorporating beneficial 
mutations. The question originates in the context of the Fisher-Muller hypothesis 
for the evolutionary advantage of sexual vs. asexual reproduction. Fisher [18] 
and H.J. Muller [41] pointed out that a disadvantage for asexual reproduction would 
arise in populations that are sufficiently large to simultaneously accommodate several 
clones of beneficial mutants. In the absence of sexual recombination, two beneficial 
mutations that have appeared in different individuals can be combined into a single 
genome only if the second mutation occurs in the offspring of the first mutant. This 
places a limit on the speed with which the population fitness increases in the asexual 
population. 

The first quantitative treatment of the Fisher-Muller effect was presented by Crow 
and Kimura Q for a model in which all beneficial mutations are assumed to have the 
same effect on the fitness of the individuals. They arrived at an expression for the 
speed of evolution in asexuals which saturates to a finite value in the limit of large 
population size $N \to \infty$, whereas for sexual populations the speed increases proportional 
to N. This conclusion was challenged by Maynard Smith [38], who showed 
(for a model with only two possible mutations) that recombination has no effect on 
the speed of adaptation in an infinite population. The resolution of the controversy 
[8, 39, 16] made it clear that the Fisher-Muller effect operates in large, but not in infinite 
populations; a first indication of the rather subtle role of population size, which will 
be a recurrent theme throughout this article. 

Prompted by progress in experimental evolution studies with microbial populations 
[57, 51, 13, 56, 26, 47], the question of the speed of evolution in the setting of 
Crow and Kimura has been reconsidered by several authors in recent years [50, 10, 9, 
2, 6, 49, 64, 63]. Using a variety of approaches, they show that, rather than approaching 
a limit for large N, the speed grows as ln N in the regime of practical interest, 
reflecting the increasing spread of the population distribution along the fitness axis. 
Considerable efforts have been devoted to deriving accurate expressions for prefactors 
and sub-asymptotic corrections. At the same time more complex models that 
allow for a distribution of mutational effects have been introduced and analyzed 12T1 

Eaaana. 

The purpose of this article is to review these developments on a level that is ac- 
cessible to statistical physicists with no prior knowledge of population genetics. In 
the next section we therefore begin by introducing the basic concepts and models, 
primarily the discrete time Wright-Fisher model with mutations and selection. Section 3 
is devoted to the dynamics of an infinitely large population. In this limit the 
dynamics becomes deterministic and can be solved exactly using generating function 
techniques. Although (as we will show) real populations operate very far from this 
limit, the infinite population behavior serves as a benchmark for the comparison with 
approximate results for finite populations, and it yields the important insight that a 
large population can be described as a traveling wave in fitness space [54, 50]. In Section 4 
we review the main approaches to the finite population problem. We provide 
simple derivations of reasonably accurate expressions for the speed of evolution, both 
for the case of a single type of beneficial mutations and for models with a distribution 
of mutational effects, which are compared to stochastic simulations over a wide range 
of population sizes. A preliminary view of the relationship between the two types of 
models is presented. Finally, in Section 5 we summarize the article and discuss some 
related topics which point to possible directions for future research. 

2 Models 

This section introduces the basic concepts and models studied in this paper. Models of 
evolving populations are based on three main features: reproduction with inheritance, 
natural selection, and mutation¹. We describe each of these features from the point of 
view of stochastic processes in discrete time. For ease of explanation, our description 
begins with the branching process well-known in the statistical physics community. 

2.1 Wright-Fisher Model 

We consider here only asexual reproduction, which is described by the number of offspring 
that each individual produces. This number differs from one individual to 
another, depends on many external events, and is thus described by a random variable. 
In the discrete time branching process without selection, an individual at time 
(or generation) t is replaced by $n_{t+1}$ individuals at time t + 1, where $n_{t+1}$ is distributed 
according to a law p(n) that is the same for all individuals² and is constant in time. 
The probability p(0) can be seen as the death probability, since the lineage of the 
individual disappears. The population is then completely described by its total size 
$N_t$. This stochastic process is known as the Bienaymé-Galton-Watson process and describes 
the growth and the death of a population without restriction on the size. Simple 
computations show that the average size grows as $E(N_t) \propto \bar{n}^t$, where $\bar{n} = \sum_n n\, p(n)$ 
is the average number of children of one individual [15]. This simple system exhibits 
a transition to an absorbing (extinct) state as $\bar{n}$ varies. When $\bar{n} < 1$, extinction 
occurs with probability 1. On the other hand, if $\bar{n} > 1$, the population grows exponentially 
with a finite probability. However, such growth is not realistic because of 
limitations of the amount of food or resources in the environment. 
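
The extinction transition of the branching process can be illustrated with a short simulation. The following is a minimal sketch, not part of the original analysis: it assumes a Poisson offspring law (the text leaves p(n) general at this point), and the function names `poisson`, `survives` and `survival_fraction`, as well as the cutoff `cap` used to declare a lineage established, are hypothetical choices made for illustration only.

```python
import math
import random

def poisson(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def survives(mean_offspring, rng, max_generations=200, cap=200):
    """Follow the descendants of a single individual.

    Returns True once the population reaches `cap` individuals (treated as
    established; an illustrative cutoff, not part of the model) and False if
    it dies out or is still below `cap` after `max_generations`.
    """
    size = 1
    for _ in range(max_generations):
        # Every individual is replaced by an independent number of offspring.
        size = sum(poisson(mean_offspring, rng) for _ in range(size))
        if size == 0:
            return False
        if size >= cap:
            return True
    return False

def survival_fraction(mean_offspring, runs=1000, seed=1):
    """Fraction of independent runs in which the lineage becomes established."""
    rng = random.Random(seed)
    return sum(survives(mean_offspring, rng) for _ in range(runs)) / runs

if __name__ == "__main__":
    for mean in (0.8, 1.5):
        print(mean, survival_fraction(mean))
```

For a mean offspring number below 1 essentially every run ends in extinction, whereas above 1 a finite fraction of runs grows without bound, in line with the statement in the text.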

In order to take this saturation effect (or environmental capacity) into account, 
we demand that the size of population remains constant, with a given value N. For 
generality, we now assume the mean number of offspring for each individual i ($1 \le i \le N$) 
to be $w_i$ in the unrestricted growth case described above, and we allow the 

1 Other important features this paper does not consider are migration and genetic recombination. 

2 The fact that all individuals have the same distribution law implies the absence of natural selection. 

Fig. 1 A cartoon illustrating the Wright-Fisher model for a population of size N = 6 over three generations. 
The arrows indicate how an individual 'chooses' its parent. 



$w_i$ to be different from each other. To make the discussion concrete, we choose a 
Poisson distribution for the number of offspring of individual i, $p_i(n_i) = w_i^{n_i} e^{-w_i}/n_i!$. 
The reproduction mechanism at constant population size can then be modeled by 
conditioning the total number of offspring $M = \sum_i n_i$ to be equal to N. The joint 
probability of the $n_i$ (without restriction) is given by 

$$\prod_{i=1}^{N} p_i(n_i) = e^{-N\bar{w}} \prod_{i=1}^{N} \frac{w_i^{n_i}}{n_i!}, \qquad (1)$$

where $\bar{w} = \sum_i w_i/N$ is the mean number of offspring per individual. The probability 
of observing M = N is 

$$P(N) = \frac{(N\bar{w})^N}{N!}\, e^{-N\bar{w}}, \qquad (2)$$

and, accordingly, the conditioned probability is given by ($\delta$ is the Kronecker delta 
symbol) 

$$P(n_1, \ldots, n_N \mid M = N) = \frac{\delta_{M,N}\, p_1(n_1) \cdots p_N(n_N)}{P(N)} = \delta_{M,N}\, \frac{N!}{\prod_i n_i!} \prod_{i=1}^{N} \left(\frac{w_i}{N\bar{w}}\right)^{n_i}, \qquad (3)$$

which is the widely-used Wright-Fisher (WF) model 116 1 II 1 81 . It then becomes equivalent 
to the following process: at time t + 1, each individual 'chooses'³ its parent i 
at time t with probability $w_i/(N\bar{w})$; see Fig. 1 for an illustration. This 
scheme is evidently not affected by multiplying all $w_i$ by a common factor. Inheritance is 
modeled by conferring to an offspring the same value of $w_i$ as its parent. 
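
The 'parent choosing' formulation translates directly into code. The sketch below is an illustration under our own naming (the function `wright_fisher_step` is hypothetical): each of the N offspring draws a parent with probability proportional to the parental fitness and inherits that fitness unchanged. Because the weights are normalized internally, multiplying all fitness values by a common factor leaves the sampling probabilities unchanged.

```python
import random

def wright_fisher_step(fitness, rng):
    """One generation of fitness-proportional resampling at constant size N.

    `fitness` is the list of absolute fitness values w_i of the current
    generation; the returned list holds the inherited fitness of each offspring.
    """
    n = len(fitness)
    # Each offspring picks parent i with probability w_i / (N * mean fitness).
    parents = rng.choices(range(n), weights=fitness, k=n)
    return [fitness[p] for p in parents]

if __name__ == "__main__":
    rng = random.Random(0)
    population = [1.0, 1.0, 1.2, 0.8, 1.0, 1.1]   # N = 6, as in Fig. 1
    for generation in range(3):
        population = wright_fisher_step(population, rng)
        print(generation + 1, population)
```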

However, the inherited genetic material may go through copying errors (or mu- 
tations), which can result in a child's having different characters from its parent. In 
order to take into account the effects of mutations, it is necessary to describe the 
characteristics of each individual. Individuals are usually characterized by a set of pa- 
rameters, the type (either the phenotype, that describes their biological functions and 
their interactions with their environment, or the genotype, that specifies their herita- 
ble genetic material). A type is transmitted from the parent to the children up to some 
changes due to genetic mutations. For our purposes, the most important characteristic 
of an individual is its fitness defined as the average expected number of offspring of 



3 In reality, of course, a child cannot choose its parent, but this usage of the terminology has no mathe- 
matical ambiguity and is widely used in the literature. 
this individual (even if, for a given realization of the process, the effective number of 
offspring can be different because of the environmental capacity) in the whole population. 
In the reproduction scheme described above the absolute fitness of individual i is 
thus given by $w_i$, the relative fitness by $\chi_i = w_i/\bar{w}$, and the probability that individual 
i is chosen as a parent in the WF model is $\chi_i/N$. Fitness differences in the population 
imply selection: individuals with large fitnesses tend to generate larger fractions of 
the population, whereas lineages with small fitnesses tend to disappear quickly. We 
return to the question of how fitness is assigned to individuals below in Section 2.2. 

In the language of statistical physics, the WF model as defined above may be seen 
as a mean-field model, because it does not take into account any spatial structure of 
the population: any individual can be the parent of any other, without any consideration 
of distance. This assumption is however realistic if one considers the mixing of 
real populations in a not-so-large environment. The role of spatial structures in evolution 
has also been studied for simple models, such as the island model [62] and the 
stepping stone model [32], which incorporate migration. The present paper focuses 
on mean-field reproduction models. 

The WF model assumes a complete replacement of the population by children 
in one generation, i.e. generations do not overlap. A model with overlapping generations 
may be defined by splitting the replacement of the population over a longer 
time. A frequently used model that includes overlapping generations as well as a limited 
environmental capacity was introduced by Moran [40]: at each time step, one 
individual chosen at random is killed and replaced by the child of another individual 
chosen with probability $\chi_i/N$. The time in the Moran model is still discrete, although 
the dynamics is evidently close to a scheme where single individuals are replaced in 
continuous time with rates proportional to the $\chi_i$. 

Both WF and Moran models have advantages and disadvantages. Unlike the WF 
model, the Moran model is amenable to some exact analysis, see Section 4.1 for an 
example. However, with regard to computational efficiency, the WF model is superior 
to the Moran model when simulating large populations. Since the conclusions rele- 
vant to biology are mostly insensitive to model details, we will base our discussion 
on the discrete time WF model, and comment on the corresponding continuous time 
or Moran model where appropriate. 



2.2 Fitness landscapes and selection coefficients 

The main difficulty in modelling biological evolution within the framework described 
so far is the choice of the functional relationship $w(\mathcal{C})$ between the type $\mathcal{C}$ of an 
individual and its fitness, referred to as the fitness landscape, which encodes in a 
single parameter the complex interactions of a type with its environment [28]. At 
least two distinct approaches circumvent this difficulty: one can either try to measure 
the function $w(\mathcal{C})$ from experimental data if the set of types is reduced [55], 
or choose the fitness landscape at random from some suitable ensemble. In the latter 
case, a widely-used further simplification consists in describing the individuals only 
by their fitnesses and ignoring the underlying structure of the types $\mathcal{C}$; mutations are 
then described only by changing the fitness of an individual by a random amount. 
This can be justified if the number of types is very large, so that every mutation effectively 
generates a new type that has never appeared before in the population. In 
population genetics this is known as the infinite number of sites approach [34, 45], 
and it will be used throughout this paper. 

Each offspring has a probability U per generation of acquiring a mutation and this 
mutation changes the parental fitness $w_i$ to the fitness $w_i'$ of the offspring. In this paper, 
mutations are assumed to act multiplicatively on the fitness $w_i$ and so the fitness $w_i'$ 
after mutation is given by 

$$w_i' = w_i (1 + s), \qquad (4)$$

where the selection coefficient s is a random variable with a distribution g(s). Mutations 
with s > 0 are beneficial and those with s < 0 deleterious. Recall that if all the 
$w_i$ are multiplied by the same quantity, then the relative fitnesses $\chi_i$ do not change, 
which justifies the multiplicative action of the mutation⁴ [28]. One expects the relative 
fitnesses to reach a stationary distribution at long times such that the average 
fitness $\bar{w}(t)$ will increase (or decrease) exponentially with a rate referred to as the 
speed of evolution 

$$v_N = \lim_{t \to \infty} \frac{\langle \ln \bar{w}(t) \rangle}{t}, \qquad (5)$$

where the angular brackets denote an average over all realizations. This speed depends 
on the population size N as well as on the mutation rate U and on the distribution 
g(s) of the mutation. Two main contributions sum up to give the speed $v_N$: 
the change of mean fitness due to mutations and the selection pressure that selects 
individuals with larger $w_i$. For the WF model, these contributions are made explicit 
through a result obtained by Guess [23, 22]: 



$$v_N = U \int \ln(1 + s)\, g(s)\, ds + \left\langle \frac{1}{N} \sum_{i=1}^{N} (\chi_i - 1) \ln \chi_i \right\rangle_{\mathrm{stat}}, \qquad (6)$$

where $\langle \cdot \rangle_{\mathrm{stat}}$ indicates an average over the stationary measure of the $\chi_i$. Only the 
second term of this expression (which is always nonnegative) is related to selection. 
However it is also the difficult part to study since the stationary distribution of the 
$\chi_i$ is generally unknown and hard to compute. If the distribution of relative fitness 
$\chi_i$ is concentrated around 1, the second term can be approximated by the variance of 
the distribution of relative fitness. This result is reminiscent of Fisher's fundamental 
theorem [18], which states that the speed of evolution is proportional to the variance 
of the fitness distribution. 

In the present paper the dependence of the speed of evolution on the distribution 
g(s) of selection coefficients is a central theme. As it turns out that deleterious muta- 
tions do not affect the adaptation of large populations when at least some fraction of 
mutations is beneficial, only beneficial mutations (s > 0) will be considered in the 
following. The mutation rate (per generation) U then refers to the rate of beneficial 



4 An alternative scheme where the mutant fitness $w_i'$ itself is chosen at random was investigated in [45]; 
see Section 5 for further discussion. 

5 When all mutations are deleterious, the fitness decreases at constant speed and the problem is known 
as Muller's ratchet, see [50, 49, 63, 27] and references therein. 
mutations, which is exceedingly small in natural populations: experimental estimates 
for bacteria range from $10^{-7}$ to $10^{-4}$ [26, 47]. The distribution of selection coefficients 
of beneficial mutations is very difficult to determine experimentally, and the 
choice of a realistic form remains an open question fBl . Moreover, the experimental 
determination of evolutionary parameters such as the mutation rate U and the typical 
size of selection coefficients depends strongly on the assumptions made about the 
shape of g(s) [26]. 

It has been argued that, because viable populations are already well adapted to 
their environment, fitness coefficients associated with beneficial mutations occur in 
the extreme high fitness tail of the underlying 'bare' fitness distribution, and therefore 
the shape of g(s) should be given by one of the invariant distributions of extreme 
value statistics [42, 29]. Here we will consider two choices for this distribution. The 
first one (model I) describes the situation where all mutations have the same selective 
strength $s_b$, 

$$g^{(\mathrm{I})}(s) = \delta(s - s_b). \qquad (7)$$

The second class of distributions (model II) is supported on the whole non-negative 
real axis and decays as a stretched exponential [9, 19], 

$$g^{(\mathrm{II})}(s) = (\beta/s_b)\, (s/s_b)^{\beta - 1} \exp\left(-(s/s_b)^{\beta}\right), \qquad (8)$$

where the factor $(s/s_b)^{\beta - 1}$ has been introduced for computational convenience. For 
$\beta = 1$, one recovers the widely-used exponential distribution [42, 21, 60, 44], whereas 
for $\beta \to \infty$ Eq. (8) reduces to Eq. (7). We note for later reference that the mean of $g^{(\mathrm{II})}$ is 
$\Gamma(1 + 1/\beta)\, s_b$. Typical values of selection coefficients obtained from evolution experiments 
with bacteria lie in the range $s_b \approx 0.01 - 0.05$ [26, 47]. Thus both U and $s_b$ can be 
treated as small parameters, with $U \ll s_b$, in most of what follows. 
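
To make the model definitions concrete, the following sketch combines WF resampling with multiplicative mutations drawn from Eq. (7) or Eq. (8) and estimates the speed from a single run. It is an illustration under stated assumptions rather than the simulation code used for the results reviewed here: the function names are hypothetical, `beta=None` is our convention for model I, and the small population size and large mutation rate in the usage example are chosen only to keep the run short. Equation (8) is a Weibull density, so the standard library routine `random.weibullvariate(scale, shape)` can be used with scale $s_b$ and shape $\beta$.

```python
import math
import random

def draw_selection_coefficient(s_b, beta, rng):
    """Sample s from Eq. (8); beta=None stands for model I, Eq. (7)."""
    if beta is None:
        return s_b
    # weibullvariate(scale, shape) has density
    # (shape/scale) * (x/scale)**(shape-1) * exp(-(x/scale)**shape).
    return rng.weibullvariate(s_b, beta)

def log_mean_fitness(log_w):
    """ln of the mean fitness, computed stably from individual log fitnesses."""
    top = max(log_w)
    return top + math.log(sum(math.exp(x - top) for x in log_w) / len(log_w))

def simulate_speed(n, u, s_b, beta, generations, seed=0):
    """Estimate ln(mean fitness)/t from one WF run started with all w_i = 1."""
    rng = random.Random(seed)
    log_w = [0.0] * n
    for _ in range(generations):
        # Selection: offspring choose parents proportionally to fitness.
        top = max(log_w)
        weights = [math.exp(x - top) for x in log_w]
        log_w = rng.choices(log_w, weights=weights, k=n)
        # Mutation: with probability u the fitness is multiplied by (1 + s), Eq. (4).
        for i in range(n):
            if rng.random() < u:
                log_w[i] += math.log1p(draw_selection_coefficient(s_b, beta, rng))
    return log_mean_fitness(log_w) / generations

if __name__ == "__main__":
    print("model I :", simulate_speed(200, 0.01, 0.02, None, 200))
    print("model II:", simulate_speed(200, 0.01, 0.02, 1.0, 200))
```

A single run gives only a noisy estimate; the definition in Eq. (5) requires an average over realizations and long times.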



3 Infinite population dynamics 

This section studies the WF model in the infinite population limit which is described 
by a deterministic evolution equation. Some of the material of this section is also 
found in the online supporting information of [44]. Since it was shown in [44] that 
deleterious mutations do not contribute to the speed in the infinite population limit, 
all mutations are assumed to be beneficial in the following. The model will first be 
solved using a discrete set of fitness values, and the transition to a continuous fitness 
space will be performed in Section 3.3.2. 



3.1 The evolution equation and its formal solution 

Let $f_t(n,k)$ denote the frequency of individuals with n (beneficial) mutations and 
with fitness $e^{k s_0}$ at generation t; here $s_0 > 0$ and k is a non-negative integer. Note that 
$f_t(n,k)$ does not discern different types which have the same number of mutations 
and the same fitness. The restriction to fitnesses $\ge 1$ is irrelevant due to the invariance 
of the dynamics under multiplication of absolute fitnesses by a common factor. The 
mean fitness of the population at generation t is 

$$\bar{w}(t) = \sum_{n,k} e^{k s_0} f_t(n,k). \qquad (9)$$

If there are no mutations, the frequency at the next generation is given by 

$$\tilde{f}_{t+1}(n,k) = \frac{1}{\bar{w}(t)}\, e^{k s_0} f_t(n,k), \qquad (10)$$

which is equal to the expected frequency at generation t + 1 for a finite population. 

After reproduction, mutations can change the type of the offspring. With probability 
U, mutations hit an individual and with probability 1 − U the offspring keeps 
the type inherited from its parent. For simplicity, we assume that a single mutation 
occurs in a single mutation event (see [44] for more general cases). For each mutation 
a strictly positive integer l is drawn from a distribution $g_0(l)$, and then 
the fitness of the offspring is that of its parent multiplied by $e^{l s_0}$. It is convenient to 
introduce the generating function of $g_0(l)$, 

$$G(z) = \sum_{l=1}^{\infty} z^l g_0(l), \qquad (11)$$

with the normalization G(1) = 1. Including the effect of mutations along with the 
selection step in Eq. (10), the frequency change becomes 

$$f_{t+1}(n,k) = (1 - U)\, \tilde{f}_{t+1}(n,k) + U \sum_{l=1}^{k} \tilde{f}_{t+1}(n-1, k-l)\, g_0(l), \qquad (12)$$

which is the main equation to be analyzed in this section. 
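The generating function of the frequency 

$$F_t(\xi, z) = \sum_{n,k} \xi^n z^k f_t(n,k) \qquad (13)$$

satisfies 

$$F_{t+1}(\xi, z) = \frac{F_t(\xi, z e^{s_0})}{F_t(1, e^{s_0})} \left[1 - U + U \xi G(z)\right], \qquad (14)$$

where we have used the relations 

$$\sum_{n,k} \xi^n z^k e^{k s_0} f_t(n,k) = F_t(\xi, z e^{s_0}), \qquad (15)$$

$$\bar{w}(t) = F_t(1, e^{s_0}), \qquad (16)$$

and the property of the convolution. Iterating Eq. (14) backwards until the initial time 
gives 

$$F_t(\xi, z) = \frac{F_0(\xi, z e^{s_0 t})}{F_0(1, e^{s_0 t})} \prod_{\tau=0}^{t-1} \frac{1 + u\, \xi\, G(z e^{s_0 \tau})}{1 + u\, G(e^{s_0 \tau})}, \qquad (17)$$

where u = U/(1 − U). One can check that Eq. (17) solves Eq. (14) by substitution. 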
3.2 General asymptotic behavior 

Using Eqs. (16) and (17), the mean fitness at generation t becomes 

$$\ln \bar{w}(t) = \ln\left(\frac{F_0(1, e^{s_0 (t+1)})}{F_0(1, e^{s_0 t})}\right) + \ln\left(1 - U + U\, G(e^{s_0 t})\right). \qquad (18)$$

If initially there is a finite $K_0$ such that $f_0(n,k) = 0$ for $k > K_0$, the first term arising 
from the initial condition saturates and does not contribute to the speed in the long 
time limit. On the other hand, if such a $K_0$ does not exist, the initial condition can 
affect the fitness increase indefinitely. For example, let $f_0(n,k) = \delta_{n,0}\, e^{-\eta} \eta^k/k!$ with 
the generating function $F_0(\xi, z) = e^{\eta (z - 1)}$. The first term on the right hand side of 
Eq. (18) then becomes $\eta\, (e^{s_0} - 1)\, e^{s_0 t}$, which does not allow a finite rate of increase even 
in the absence of mutations. This is a peculiarity of the selection dynamics in the infinite 
population limit and it is not difficult to understand why this happens. Since 
selection confers exponential growth to all types with fitness larger than the average 
$\bar{w}(t)$ and there are always individuals of such types at any generation t due to the unbounded 
initial condition, the mean fitness can grow indefinitely without recourse to 
beneficial mutations. Because this is a rather artificial situation which has no biological 
relevance, we assume the existence of $K_0$ in what follows. Actually, for simplicity 
the initial condition 

$$f_0(n,k) = \delta_{n0}\, \delta_{k0}, \qquad F_0(\xi, z) = 1 \qquad (19)$$

will be used throughout this paper. 
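
Equation (18) can be checked by iterating the evolution equation directly. The sketch below is an illustration with hypothetical names: it sums $f_t(n,k)$ over n, iterates the selection step (10) and the mutation step (12) from the initial condition (19), and compares $\ln \bar{w}(t)$ with the second term of Eq. (18) (the first term vanishes for this initial condition). The jump distribution `g0` with two possible jump sizes and the parameter values are arbitrary choices for the demonstration; times are kept short because floating point underflow at the leading edge would otherwise act like an artificial finite-population cutoff.

```python
import math

def iterate_mean_fitness(u, s0, g0, generations):
    """Iterate Eqs. (10) and (12), summed over n, starting from Eq. (19).

    g0 maps a jump size l >= 1 to its probability.
    Returns the list [ln wbar(t) for t = 0..generations].
    """
    max_jump = max(g0)
    # f[k] is the frequency of individuals with fitness exp(k * s0).
    f = [1.0] + [0.0] * (max_jump * generations)
    log_wbar = []
    for _ in range(generations + 1):
        wbar = sum(math.exp(k * s0) * fk for k, fk in enumerate(f))
        log_wbar.append(math.log(wbar))
        # Selection, Eq. (10).
        selected = [math.exp(k * s0) * fk / wbar for k, fk in enumerate(f)]
        # Mutation, Eq. (12): a mutation of size l moves weight from k - l to k.
        f = [(1.0 - u) * selected[k]
             + u * sum(p * selected[k - l] for l, p in g0.items() if k - l >= 0)
             for k in range(len(f))]
    return log_wbar

def predicted_log_wbar(u, s0, g0, t):
    """Second term of Eq. (18): ln(1 - U + U * G(exp(s0 * t)))."""
    g_value = sum(p * math.exp(l * s0 * t) for l, p in g0.items())
    return math.log(1.0 - u + u * g_value)

if __name__ == "__main__":
    g0 = {1: 0.7, 2: 0.3}
    numeric = iterate_mean_fitness(0.01, 0.05, g0, 60)
    for t in (0, 20, 40, 60):
        print(t, numeric[t], predicted_log_wbar(0.01, 0.05, g0, t))
```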

As $t \to \infty$, the speed is determined solely by the generating function of beneficial 
mutations. Let $K_{\max} = \max\{k : g_0(k) \ne 0\}$; then, due to the exponential growth of 
the argument of $G(e^{s_0 t})$, the speed for the infinite size population becomes 

$$v_\infty = K_{\max}\, s_0. \qquad (20)$$

This shows that the mutation of largest effect governs the speed, which is not surprising 
because genetic drift (a term referring to the stochastic loss of a beneficial 
mutation in a finite population, see Section 4.1) is not operative. If we take $K_{\max} \to \infty$ 
with $s_0$ fixed, the speed diverges. Note that the speed in the infinite population limit 
does not depend on the mutation rate. It is only determined by the maximum value of 
the fitness increase by a single beneficial mutation event. 

Another peculiarity of the infinite population limit is the possibility that the fitness 
becomes infinite at finite time. If G(z) is not an entire function, the series defining the 
generating function has a finite radius of convergence, say $\mathcal{R}$, beyond which the series 
diverges. Hence when $e^{s_0 t} > \mathcal{R}$ or $t > \ln \mathcal{R}/s_0$, the mean fitness becomes infinite. For 
example, let $g_0(l) = (1 - p)\, p^{l-1}$, which yields $G(z) = (1 - p) z/(1 - pz)$ for $pz < 1$ and 
infinite otherwise. Hence for $t > -\ln p/s_0$, the fitness becomes infinite. The radius of 
convergence for this example is $\mathcal{R} = 1/p$. Also note that the radius of convergence 
cannot be smaller than 1 because the generating function of a probability distribution is absolutely 
convergent for $|z| \le 1$ by definition. In the following, G(z) is assumed to be an entire 
function, that is, $g_0(l)$ is assumed to decay faster than exponentially in the asymptotic 
regime. 
We now proceed to calculate the mean and variance of the number of accumulated 
mutations in the infinite population limit. First, the mean number of mutations is 
calculated as 

$$\bar{n}(t) = \left.\frac{\partial}{\partial \xi} \ln F_t(\xi, 1)\right|_{\xi=1} = t - \sum_{\tau=0}^{t-1} \frac{1}{1 + u\, G(e^{s_0 \tau})}. \qquad (21)$$

Since $G(e^{s_0 \tau})$ grows at least exponentially with $\tau$, the second term approaches a finite 
value. Clearly Eq. (21) gives the large population limit of the substitution rate k, 
defined here as the infinite time limit of $\bar{n}(t)/t$: 

$$k = \lim_{t \to \infty} \frac{\bar{n}(t)}{t} = 1. \qquad (22)$$

The variance of the number of mutations reads 

$$\delta n(t)^2 = \left.\left(\xi \frac{\partial}{\partial \xi}\right)^2 \ln F_t(\xi, 1)\right|_{\xi=1} = \sum_{\tau=0}^{t-1} \frac{u\, G(e^{s_0 \tau})}{\left(1 + u\, G(e^{s_0 \tau})\right)^2}, \qquad (23)$$

which has a finite limit as $t \to \infty$. 



3.3 Case studies 

Using the results presented above, we study the detailed evolution for two specific 
examples. To begin with, Sec. 3.3.1 studies the simple case $g_0(l) = \delta_{l1}$, which 
corresponds to Eq. (7) with $s_b = s_0$. Then in Sec. 3.3.2 we generalize our solution to a 
continuous fitness distribution such as Eq. (8). 



3.3.1 The case of a single selection coefficient 

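When $g_0(l) = \delta_{l1}$ the calculation is rather straightforward. Because the number of 
mutations fully specifies the fitness of a type, we replace $f_t(n,n)$ by $f_t(n)$ throughout 
this subsection. From Eq. (18), the mean fitness becomes 

$$\ln \bar{w}(t) = \ln\left(1 - U + U e^{s_0 t}\right) \approx s_0 \left(t - \frac{1}{s_0} \ln \frac{1}{U}\right) = s_0 (t - t_0) \qquad (24)$$

with 

$$t_0 = \frac{1}{s_0} \ln \frac{1}{U}, \qquad (25)$$

which gives $v_\infty = s_0$. The mean number of mutations in the long time limit can be 
calculated from Eq. (21) as 

$$\bar{n}(t) \approx t - \sum_{\tau=0}^{\infty} \frac{1}{1 + u e^{s_0 \tau}} \approx t - \int_0^\infty \frac{dx}{1 + u e^{s_0 x}} = t - \frac{1}{s_0} \ln\left(1 + \frac{1}{u}\right) = t - t_0, \qquad (26)$$

where we approximate the summation by an integral assuming $s_0 \ll 1$ and $U \ll 1$. 
Not surprisingly, $s_0 \bar{n}(t) \approx \ln \bar{w}(t)$ in the long time limit. Likewise, the variance of the 
number of mutations is calculated as 

$$\delta n(t)^2 \approx \sum_{\tau=0}^{\infty} \frac{u e^{s_0 \tau}}{(1 + u e^{s_0 \tau})^2} \approx \int_0^\infty \frac{u e^{s_0 x}}{(1 + u e^{s_0 x})^2}\, dx = \frac{1 - U}{s_0}. \qquad (27)$$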
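
The quality of replacing the sums by integrals is easy to check numerically. The following sketch (function name hypothetical) evaluates the sums of Eqs. (21) and (23) for G(z) = z term by term, so that the results can be compared with $t - t_0$ and $(1 - U)/s_0$; the parameter values in the usage example are the typical values quoted in this section.

```python
import math

def mutation_count_moments(u_rate, s0, t):
    """Mean and variance of the mutation number from Eqs. (21) and (23) with G(z) = z."""
    u = u_rate / (1.0 - u_rate)
    mean = variance = 0.0
    for tau in range(t):
        x = u * math.exp(s0 * tau)
        p = x / (1.0 + x)            # equals 1 - 1/(1 + u*exp(s0*tau))
        mean += p                    # Eq. (21): t minus the sum of 1/(1 + u G)
        variance += p * (1.0 - p)    # Eq. (23): sum of u G / (1 + u G)^2
    return mean, variance

if __name__ == "__main__":
    u_rate, s0, t = 1e-5, 0.02, 2000
    t0 = math.log(1.0 / u_rate) / s0
    mean, variance = mutation_count_moments(u_rate, s0, t)
    print("mean    :", mean, "vs t - t0 =", t - t0)
    print("variance:", variance, "vs (1 - U)/s0 =", (1.0 - u_rate) / s0)
```

The mean obtained from the discrete sum differs from $t - t_0$ by a correction of order one, which is negligible compared to $t_0$ for the parameters of interest.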

Now we will show that the frequency distribution in the asymptotic limit can 
be approximated by a Gaussian. From Eq. (12) with $\bar{w}(t) \approx e^{s_0 (t - t_0)}$, the frequency at 
generation t can be approximated as 

$$f_t(n) \approx f_{t-1}(n)\, e^{s_0 (n - (t-1) + t_0)} \approx f_n(n) \exp\left[s_0 \sum_{\tau=1}^{t-n} (n + t_0 - t + \tau)\right] \approx f_n(n)\, e^{s_0 t_0^2/2}\, e^{-s_0 (n - \bar{n}(t))^2/2}, \qquad (28)$$

where n and t are assumed sufficiently large and we neglect the effect of mutations. 
Next we show that $f_n(n) \approx e^{-s_0 t_0^2/2}$ at long times, which concludes the demonstration 
that $f_t(n)$ becomes Gaussian. Under the assumptions of our model the largest number 
of mutations accumulated by an individual up to t is t, and from Eq. (17) the frequency 
of such individuals is 

$$f_t(t) = \prod_{\tau=0}^{t-1} \frac{u e^{s_0 \tau}}{1 + u e^{s_0 \tau}}. \qquad (29)$$

Since $u e^{s_0 t}$ becomes larger than 1 at $t \approx t_0$, the term $u e^{s_0 \tau}$ in the denominator of 
Eq. (29) makes a dominant (negligible) contribution for $\tau > t_0$ ($\tau < t_0$). Thus, we may 
approximate Eq. (29) in the long time limit as 

$$\lim_{t \to \infty} f_t(t) \approx U^{t_0} e^{s_0 t_0 (t_0 - 1)/2} \approx e^{-s_0 t_0^2/2}, \qquad (30)$$

which shows that $f_t(n)$ is well described by a travelling wave in the form of a Gaussian. 

The above consideration gives an interesting criterion for the population size beyond 
which the infinite population dynamics becomes valid. If the population size is 
larger than 

$$N_c = \exp(s_0 t_0^2/2) = \exp\left[\ln^2 U/(2 s_0)\right], \qquad (31)$$

the number of fittest individuals at a given generation is not smaller than 1 for all 
times t (note that $f_t(t)$ is a decreasing function of t). Since the selection coefficient 
of the types with t mutations compared to the mean fitness is approximately 
$e^{s_0 t}/\bar{w}(t) - 1 \approx 1/U \gg 1$, we can neglect the possible loss of such a type by genetic 
drift even if it is rare, which means that the infinite population dynamics describes 
a finite population with $N > N_c$. To provide an impression of how large $N_c$ is, we 
choose typical values $s_0 = 0.02$ and $U = 10^{-5}$, which give⁶ $N_c \approx 10^{1439}$. 
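
Because $N_c$ is far outside the range of floating point numbers, it is convenient to work with its decimal logarithm. The sketch below (hypothetical function names) evaluates Eq. (31) and, as one possible refinement, the inverse of the long time limit of $f_t(t)$, written as the convergent sum of $\ln(1 + e^{-s_0 \tau}/u)$ over $\tau \ge 0$; this product form is our own rewriting of the frequency of the fittest class implied by Eq. (17) for G(z) = z, and the truncation tolerance `tol` is an illustrative choice.

```python
import math

def log10_nc_estimate(u_rate, s0):
    """log10 of N_c according to Eq. (31)."""
    return math.log(u_rate) ** 2 / (2.0 * s0) / math.log(10.0)

def log10_nc_product(u_rate, s0, tol=1e-18):
    """log10 of 1 / lim f_t(t), summing ln(1 + exp(-s0*tau)/u) until terms fall below tol."""
    u = u_rate / (1.0 - u_rate)
    total, tau = 0.0, 0
    while True:
        term = math.log1p(math.exp(-s0 * tau) / u)
        total += term
        if term < tol:
            break
        tau += 1
    return total / math.log(10.0)

if __name__ == "__main__":
    print("Eq. (31)     : log10 N_c =", log10_nc_estimate(1e-5, 0.02))
    print("product form : log10 N_c =", log10_nc_product(1e-5, 0.02))
```

The two numbers differ because Eq. (30) keeps only the leading behavior of the product; on the logarithmic scale the difference is a few percent.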

To include the effect of mutations, we use Eqs. (26) and (27) to write the frequency 
distribution in the form 

$$f_t(n) \approx \frac{1}{\sqrt{2\pi (1 - U)/s_0}} \exp\left(-\frac{(n - \bar{n}(t))^2}{2 (1 - U)/s_0}\right), \qquad (32)$$



The more accurate value obtained by exact numerical calculation isN c »s 2 X 10 

Fig. 2 (a) Frequency distribution of the infinite population dynamics for the case of $g_0(l) = \delta_{l1}$ with 
$s_0 = 0.02$ and $U = 10^{-5}$. The distributions are shown at t = 1900 (left), t = 1950 (middle), and t = 2000 
(right). The peak is located at $t + (\ln U)/s_0 \approx t - 575.65$. (b) Plot of $\ln f_t(n)$ at t = 2000 as a function of n 
in comparison to Eq. (32). Only a tiny deviation around n = 2000 is visible. 



where the prefactor is fixed by normalization. Note that for sufficiently small U, 
Eq. (28) is consistent with Eq. (32). Figure 2 compares the numerically obtained 
frequency distribution with Eq. (32) for $U = 10^{-5}$ and $s_0 = 0.02$. 

The idea that evolution can be described as a traveling wave moving at constant 
speed along the fitness axis was first presented by Tsimring et al. [54, 30], who considered 
the continuous time version of the model with multiplicative mutations and 
a single selection coefficient. In the continuous time case the speed of evolution diverges 
in the infinite population limit, because there is no bound on the number of 
mutations that a single individual can accumulate in a given time. A wave moving at 
finite speed is obtained only if the finite size of the population is introduced at least 
on the level of a lower cutoff on the frequency distribution. 



3.3.2 Solution for a continuous fitness space 

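In this section, we explain how the above calculation can be generalized to a continuous 
fitness space. We will use Eq. (8) for the distribution of selection coefficients. 
To connect to the results in Sect. 3.1 we perform a change of variables such that 
$e^x = 1 + s$, where x denotes the continuous version of $l s_0$. When s is drawn from 
$g^{(\mathrm{II})}(s)$, the probability density for x becomes 

$$p(x) = \frac{\beta}{s_b} \left(\frac{e^x - 1}{s_b}\right)^{\beta - 1} \exp\left[-\left(\frac{e^x - 1}{s_b}\right)^{\beta}\right] e^x. \qquad (33)$$

Setting $x = l s_0$, the corresponding discrete distribution is 

$$g_0(l) = \mathcal{N}(s_0)\, \frac{\beta s_0}{s_b} \left(\frac{e^{l s_0} - 1}{s_b}\right)^{\beta - 1} \exp\left[-\left(\frac{e^{l s_0} - 1}{s_b}\right)^{\beta}\right] e^{l s_0}, \qquad (34)$$

where $\mathcal{N}(s_0)$ is the normalization constant which approaches 1 as $s_0 \to 0$. We now 
follow Sect. 3.1 and calculate $G(e^{s_0 t})$ as 

$$G(e^{s_0 t}) = \mathcal{N}(s_0) \sum_{l=1}^{\infty} \frac{\beta s_0}{s_b} \left(\frac{e^{l s_0} - 1}{s_b}\right)^{\beta - 1} \exp\left[-\left(\frac{e^{l s_0} - 1}{s_b}\right)^{\beta}\right] e^{l s_0 (t+1)} \to \int_0^\infty dx\, \frac{\beta}{s_b} \left(\frac{e^x - 1}{s_b}\right)^{\beta - 1} \exp\left[-\left(\frac{e^x - 1}{s_b}\right)^{\beta}\right] e^{x (t+1)}, \qquad (35)$$

where we take $s_0 \to 0$ with $s_0 l = x$ finite. Letting $y^{1/\beta} = (e^x - 1)/s_b$, the above integral, 
say $W_t$, becomes 

$$W_t = \int_0^\infty (1 + s_b y^{1/\beta})^t\, e^{-y}\, dy = \int_0^\infty \exp\left(-y + t \ln(1 + s_b y^{1/\beta})\right) dy. \qquad (36)$$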

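Using the steepest descent method, this can be approximated as

$$
W_t \approx \exp\left( -y_c + t \ln\left( 1 + s_b y_c^{1/\beta} \right) \right), \tag{37}
$$

where $y_c$ is the solution of the saddle point equation

$$
y_c + \frac{y_c^{1 - 1/\beta}}{s_b} = \frac{t}{\beta} \quad \longrightarrow \quad y_c \approx \frac{t}{\beta}. \tag{38}
$$

Thus, the leading behavior of $\ln W_t$ becomes $t \ln t/\beta$, which along with Eq. (18) yields

$$
\ln \bar{w}(t) \approx \ln W_t \approx \frac{t \ln t}{\beta} \tag{39}
$$

for any finite $\beta$. On the other hand, for $\beta \to \infty$, (36) yields $W_t = (1 + s_b)^t$ and hence
$\ln \bar{w}(t) \sim t$.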
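
The quality of the saddle point estimate can be checked numerically. The following Python sketch is an illustration and not part of the original analysis. It assumes the integral in the form $W_t = \int_0^\infty \exp(-y + t \ln(1 + s_b y^{1/\beta}))\, dy$ and the saddle point condition $y_c + y_c^{1-1/\beta}/s_b = t/\beta$, and it is restricted to $\beta \geq 1$, where the exponent is concave in y. The function names, the integration cutoff `4 * t` and the grid size are arbitrary choices made here.

```python
import math

def log_integrand(y, t, s_b, beta):
    """Exponent of the integrand: -y + t*ln(1 + s_b*y**(1/beta))."""
    return -y + t * math.log1p(s_b * y ** (1.0 / beta))

def saddle_point(t, s_b, beta):
    """Solve y + y**(1 - 1/beta)/s_b = t/beta by bisection.

    Assumes beta >= 1, so that the left-hand side is non-decreasing in y,
    and t large enough that a root exists in (0, t/beta].
    """
    def f(y):
        return y + y ** (1.0 - 1.0 / beta) / s_b - t / beta
    lo, hi = 0.0, t / beta
    if f(lo) > 0.0:
        raise ValueError("t is too small for a positive saddle point")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_W_saddle(t, s_b, beta):
    """Saddle point estimate of ln W_t (exponent evaluated at y_c)."""
    return log_integrand(saddle_point(t, s_b, beta), t, s_b, beta)

def log_W_numeric(t, s_b, beta, n=200_000):
    """ln W_t from a trapezoidal rule, evaluated in log space to avoid overflow."""
    y_max = 4.0 * t  # assumed cutoff; the integrand is negligible beyond it for large t
    h = y_max / n
    logs = [log_integrand(i * h, t, s_b, beta) for i in range(n + 1)]
    m = max(logs)
    total = sum(math.exp(v - m) for v in logs)
    total -= 0.5 * (math.exp(logs[0] - m) + math.exp(logs[-1] - m))
    return m + math.log(total * h)

t, s_b, beta = 1000, 0.02, 1.0
print(log_W_saddle(t, s_b, beta), log_W_numeric(t, s_b, beta), t * math.log(t) / beta)
```

The two estimates of $\ln W_t$ should differ only by the logarithmic correction from the Gaussian fluctuations around the saddle point, whereas the leading term $t \ln t/\beta$ is approached slowly, because contributions linear in t are not small at moderate t.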

For $\beta = 1$, a more accurate approximation can be found in Ref. [44]. Here we
observe that the speed $\ln \bar{w}(t)/t$ increases logarithmically with time irrespective of
the value of $\beta$. The linear relation (21) for the mean number of beneficial mutations
is still valid, because the summation on the right hand side of Eq. (21) approaches
a non-negative finite number in the continuum limit $s_0 \to 0$.

We conclude that the speed of evolution is infinite in the infinite population limit
for distributions of selection coefficients like (8), which have unbounded support. Superficially
this is reminiscent of the situation in the continuous time model with a
single selection coefficient considered in [54,30], but it is important to note that the
reasons for the divergence of the speed are quite different in the two cases. In the
continuous time setting the speed diverges because the number of mutations accumulated
in a given time is unbounded, whereas in the discrete time model the divergence
reflects that a single mutation can have an arbitrarily large effect. We will encounter
a similar dichotomy in the discussion of the finite population dynamics in the next
section.



4 Finite populations

4.1 Genetic drift, fixation and clonal interference 

Consider a single beneficial mutation with selection coefficient s > 0 which is introduced
into an initially homogeneous population. Following the evolution of the
population under WF dynamics without allowing for further mutations (U = 0), one
can distinguish different time scales. The survival of the mutation during the first few
generations is very fragile, due to the stochasticity of the reproduction process: the
number of individuals carrying the mutation is small and the variance is of the same
order as the mean. These fluctuations are called genetic drift. After this drift phase,
either the mutation goes extinct (with probability $1 - \pi_N(s)$) or the number of
individuals becomes large enough (with probability $\pi_N(s)$) that stochastic fluctuations
can be neglected and the evolution can be considered deterministic. A
mutation that has reached the latter regime is called established$^7$ [39,16,17], and it
will (in the absence of other mutations) eventually take over the entire population.
This process is called fixation, $\pi_N(s)$ is the fixation probability, and the time needed
for a mutation that survives to spread all over the population is the fixation time $t_{\mathrm{fix}}$.
The fixation probability for the Moran model is given by [12]

$$
\pi_N(s) = \frac{s}{1 + s - (1 + s)^{-(N-1)}}, \tag{40}
$$

but for the Wright-Fisher model only approximate expressions are available l53l [D. 
A widely used formula is [33]

$$
\pi_N(s) = \frac{1 - e^{-2s}}{1 - e^{-2Ns}}. \tag{41}
$$

For $s \to 0$ both (40) and (41) reduce to $\pi_N = 1/N$, as is obvious from a symmetry argument:
When the fitness of the mutant is equal to that of the background population,
the probability of fixation is the same for all N individuals. Both expressions show
that the fixation of deleterious mutations (s < 0) is exponentially suppressed for large
N, while the fixation probability for beneficial mutations becomes independent of N,
reducing for (41) to

$$
\pi_N(s) \to \pi(s) = 1 - e^{-2s}. \tag{42}
$$

When s is small (as will often be the case) this can be further simplified to

$$
\pi(s) \approx 2s, \tag{43}
$$

while $\pi(s) \approx s$ for the Moran model.
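
The limits quoted above are easy to verify numerically. The following Python sketch is an illustration only. It assumes the Moran expression in the form $\pi_N(s) = s/(1 + s - (1+s)^{-(N-1)})$ and the Wright-Fisher diffusion formula $\pi_N(s) = (1 - e^{-2s})/(1 - e^{-2Ns})$; the function names are hypothetical.

```python
import math

def pi_moran(s, N):
    """Fixation probability of a single mutant in the Moran model."""
    if s == 0.0:
        return 1.0 / N
    return s / (1.0 + s - (1.0 + s) ** (1 - N))

def pi_kimura(s, N):
    """Diffusion approximation for the Wright-Fisher model."""
    if s == 0.0:
        return 1.0 / N
    # expm1 keeps precision for small arguments. For strongly deleterious
    # mutations with 2*N*|s| larger than about 700 the denominator overflows.
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

N = 1000
for s in (1e-7, 0.02, -0.02):
    print(s, pi_moran(s, N), pi_kimura(s, N))
```

For nearly neutral mutations both functions should return values close to 1/N, for beneficial mutations in large populations they approach $s/(1+s)$ and $1 - e^{-2s}$ respectively, and for deleterious mutations the result is exponentially small in N.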

7 Although this terminology is mathematically ambiguous, it is widely used in the community because,
we think, it is inspirational.

8 The derivation of the WF model from the branching process in Sect. 2 easily explains this connection.

In the limit $N \to \infty$ the restriction on the size of the growing mutant clone is irrelevant
and the WF model reduces to$^8$ a Bienaymé-Galton-Watson branching process
with a Poisson offspring distribution of mean 1 + s. The fixation probability is then
equal to the survival probability of the branching process, which satisfies the implicit
relation IMElTfl 

$$
\pi = 1 - e^{-(1+s)\pi}. \tag{44}
$$

Expanding Eq. (44) to second order we recover Eq. (43) for small s, but for large s the
exact fixation probability approaches unity as $1 - \pi \approx e^{-(1+s)}$, in contrast to Eq. (42).
In the following we nevertheless use (42) when values of the fixation probability are
required for the full range of selection coefficients, and (43) when s is small.
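
The implicit relation for the survival probability has no closed-form solution, but its positive root is easily found numerically. The following Python sketch is an illustration under the assumption that the relation reads $\pi = 1 - e^{-(1+s)\pi}$; the function name and the tolerance are choices made here. Bisection is used because plain fixed-point iteration converges slowly when s is small.

```python
import math

def survival_probability(s, tol=1e-14):
    """Positive root of pi = 1 - exp(-(1+s)*pi) for s > 0, found by bisection."""
    if s <= 0.0:
        return 0.0  # critical or subcritical branching process: certain extinction

    def f(p):
        # f(p) = p - (1 - exp(-(1+s)*p)), written with expm1 for accuracy
        return p + math.expm1(-(1.0 + s) * p)

    lo = s / (1.0 + s) ** 2  # f(lo) < 0, because exp(-x) <= 1 - x + x**2/2 for x >= 0
    hi = 1.0                 # f(1) = exp(-(1+s)) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for s in (0.02, 0.5, 5.0):
    p = survival_probability(s)
    print(s, p, 1.0 - math.exp(-2.0 * s), 2.0 * s)
```

The printed columns allow a direct comparison of the branching process result with $1 - e^{-2s}$ and with the small-s form 2s.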

The approximation by a branching process is also useful in deriving a heuristic
estimate of the population size required for a mutant clone to become established [39].
In this approximation the average population size of the clone grows as $(1 + s)^t \approx e^{st}$.
However, since this average also includes instances where the clone goes extinct (with
probability $1 - \pi$), the population size conditioned on survival of the clone is larger
by a factor $1/\pi \approx 1/(2s)$. Such a clone thus looks as if it started out containing already
$\sim 1/(2s)$ individuals, which is precisely the threshold size separating stochastic from
deterministic growth (see e.g. [9,6] for a detailed treatment of this point).

In order to get some intuition about the fixation time $t_{\mathrm{fix}}$, one can look at the
deterministic evolution of a mutant of type A that appears in a population consisting of
the "wild type" B. The fitnesses can be taken as $w_A = 1 + s$ and $w_B = 1$. We assume
that the type A has survived genetic drift and we have a frequency $a_t$ of individuals
of type A and $b_t = 1 - a_t$ of individuals of type B. The deterministic evolution is thus
given by

$$
a_{t+1} = \frac{1 + s}{\bar{w}_t}\, a_t, \qquad b_{t+1} = \frac{1}{\bar{w}_t}\, b_t, \qquad \bar{w}_t = (1 + s) a_t + b_t, \tag{45}
$$

and the solution is

$$
b_t = \frac{b_0}{a_0 (1 + s)^t + b_0}, \qquad a_t = 1 - b_t. \tag{46}
$$

For a finite population of large size N, the type B can be considered as extinct when
$b_t = 1/N$. With the initial condition of a single mutant, $a_0 = 1/N$, this expression
gives the fixation time for large N as

$$
t_{\mathrm{fix}} = \frac{2 \ln(N - 1)}{\ln(1 + s)} \approx \frac{2 \ln N}{s} \tag{47}
$$

when s is small.

For later purposes, we also need the total number of individuals of type B that
have existed during the fixation time of A. We note that

$$
a_{t_{\mathrm{fix}} - t} = \frac{(1 + s)^{t_{\mathrm{fix}} - t}}{(1 + s)^{t_{\mathrm{fix}} - t} + N - 1} = \frac{N - 1}{(1 + s)^t + N - 1} = b_t, \tag{48}
$$

where we have used $(1 + s)^{t_{\mathrm{fix}}} = (N - 1)^2$. We can thus conclude that, during the
fixation of type A, one has $\int a_t\, dt \approx \int b_t\, dt$, so that the total number of individuals of
type B is $\approx N t_{\mathrm{fix}}/2 \approx N \ln N/\ln(1 + s)$.
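
Both the fixation time and the symmetry between mutant and wild type can be checked by iterating the deterministic recursion directly. The following Python sketch is an illustration only. It assumes the recursion $a_{t+1} = (1+s) a_t/\bar{w}_t$, $b_{t+1} = b_t/\bar{w}_t$ with $\bar{w}_t = (1+s) a_t + b_t$, starts from a single mutant, and stops when the wild type frequency has dropped to 1/N. The function name and return values are choices made here.

```python
import math

def deterministic_sweep(N, s):
    """Iterate the deterministic recursion from a single mutant.

    Returns the number of generations until the wild type frequency drops
    to 1/N, and the wild type frequency summed over those generations.
    """
    a, b = 1.0 / N, 1.0 - 1.0 / N
    t, b_sum = 0, 0.0
    while b > 1.0 / N:
        b_sum += b
        w_bar = (1.0 + s) * a + b  # mean fitness of the population
        a, b = (1.0 + s) * a / w_bar, b / w_bar
        t += 1
    return t, b_sum

N, s = 10**6, 0.02
t_iter, b_sum = deterministic_sweep(N, s)
t_exact = 2.0 * math.log(N - 1) / math.log1p(s)
print(t_iter, t_exact, 2.0 * math.log(N) / s, b_sum / t_exact)
```

The iterated time should agree with $2\ln(N-1)/\ln(1+s)$ up to rounding to an integer number of generations, and the last printed ratio should be close to 1/2, which reflects that the wild type occupies on average half of the population during the sweep.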

Fig. 3 In the periodic selection regime $t_{\mathrm{mut}} \gg t_{\mathrm{fix}}$ and beneficial mutations fix independently of each other.
Each blue line represents a selective sweep.

This simple example shows the dependence of $t_{\mathrm{fix}}$ on N when the mutation rate U
is set to 0 after the emergence of the mutant type A. If U is non-zero, the expression
for $t_{\mathrm{fix}}$ is valid only as long as U is small enough, so that no new mutation emerges
before the fixation of the previous one. The average time between two mutations that
survive genetic drift is

$$
t_{\mathrm{mut}} = \frac{1}{N U \pi(s)} \approx \frac{1}{2 N U s}. \tag{49}
$$

If $t_{\mathrm{mut}} \gg t_{\mathrm{fix}}$, i.e. if

$$
N U \ln N \ll 1 \tag{50}
$$

for s small, then no mutation interferes and we are in the periodic selection regime
for which

$$
v_N = \frac{s}{t_{\mathrm{mut}}} \propto s^2 N U. \tag{51}
$$

This situation is sketched in Fig. 3. The main feature of this regime is that the selective
sweeps associated with different beneficial mutations are independent and well
separated in time, and therefore the speed of evolution is directly proportional to the
supply of beneficial mutations NU.
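
The two time scales and the resulting criterion can be evaluated for concrete parameters. The following Python sketch is an illustration only. It assumes the small-s forms $t_{\mathrm{fix}} \approx 2 \ln N/s$ and $t_{\mathrm{mut}} \approx 1/(2NUs)$, and the prefactor 2 in the speed follows from $v_N = s/t_{\mathrm{mut}}$ with $\pi(s) \approx 2s$. The function names are hypothetical.

```python
import math

def sweep_time_scales(N, U, s):
    """Return (t_mut, t_fix) in generations for small s."""
    t_mut = 1.0 / (2.0 * N * U * s)   # waiting time between established mutations
    t_fix = 2.0 * math.log(N) / s     # deterministic fixation time
    return t_mut, t_fix

def interference_parameter(N, U):
    """N*U*ln(N); periodic selection requires this to be much smaller than 1."""
    return N * U * math.log(N)

def periodic_selection_speed(N, U, s):
    """Speed s/t_mut = 2*s**2*N*U, valid only in the periodic selection regime."""
    return 2.0 * s * s * N * U

U, s = 1e-6, 0.02
for N in (10**3, 10**4, 10**5, 10**6):
    t_mut, t_fix = sweep_time_scales(N, U, s)
    print(N, t_mut / t_fix, interference_parameter(N, U), periodic_selection_speed(N, U, s))
```

With these parameters the ratio $t_{\mathrm{mut}}/t_{\mathrm{fix}}$ equals $1/(4 N U \ln N)$, so the separation of time scales is lost once $N U \ln N$ becomes of order one.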

On the other hand, if $t_{\mathrm{mut}}$ and $t_{\mathrm{fix}}$ are of the same order, then mutations can occur
during the fixation process of previous mutations [39,60] and the distinction between
$t_{\mathrm{mut}}$ and $t_{\mathrm{fix}}$ becomes unclear. In Fig. 4 we present an example showing how the population
dynamics changes when the criterion (50) is violated. Following [21] we refer
to the interaction among beneficial mutant clones in this regime as clonal interference.

In the remaining parts of this section we present the main analytic approaches 
that have been developed to compute the speed of evolution in the clonal interference 
regime. We begin by considering the case where all mutations have the same selection 
coefficient (model I) and then treat the case of a continuous distribution of selection
coefficients$^9$ (model II).



4.2 Model I: Single selection coefficient of beneficial mutations 
4.2.1 The Crow-Kimura-Felsenstein approach 

The first attempt to compute the speed of evolution in the presence of clonal interfer- 
ence is due to Crow and Kimura Q. We present their calculation in the form given by 

9 Note that in part of the literature [9,6,49] the term "clonal interference" is restricted to model II.

Fig. 4 The frequencies of the five most populated genotypes are shown in different colors for the WF
model using the distribution (8) with $\beta = 1$, $U = 10^{-6}$, $s_b = 0.02$. The population sizes are (a) $N = 10^4$,
(b) $N = 10^5$, (c) $N = 10^6$, and (d) $N = 10^7$, respectively. From $N = 10^5$ onward, where $N U \ln N \approx 1.15$, the
third most populated genotype becomes visible and the distinction between $t_{\mathrm{mut}}$ and $t_{\mathrm{fix}}$ becomes blurred,
which signals the onset of clonal interference.



Felsenstein [16], which takes into account that only established mutations contribute
to the adaptation process. Such mutations (with selection coefficient $s_b$) appear in the
population at rate $\pi(s_b) N U$. Assuming that a mutation was established at time t = 0,
we now ask for the waiting time T until a second mutation is established in the offspring
of the first. We take $s_b$ to be small, such that $\pi(s_b) \approx 2 s_b$ and $(1 + s_b)^t \approx e^{s_b t}$.
Then, according to (46), the frequency $a_t$ of the mutant starting at $a_0 = 1/(2 s_b N)$ is

$$
a_t = \frac{1}{1 + (2 s_b N - 1) e^{-s_b t}}. \tag{52}
$$

The number of mutants at time t is $N a_t$, and each mutant generates an established
second mutant with probability $2 s_b U$ per generation. We therefore need to compute
the accumulated number of mutants $N_{\mathrm{acc}}$ that have existed up to time t, where each
individual is weighted by the number of generations during which it has existed. Approximating
the sum over generations by an integral, this is given by

$$
N_{\mathrm{acc}}(t) = N \int_0^t dt'\, a_{t'} = \frac{N}{s_b} \ln\left[ \frac{e^{s_b t}}{2 N s_b} + 1 - \frac{1}{2 N s_b} \right] \approx \frac{N}{s_b} \ln\left[ \frac{e^{s_b t}}{2 N s_b} + 1 \right] \tag{53}
$$

for $N s_b \gg 1$. The waiting time T is then determined from the condition $2 s_b U N_{\mathrm{acc}}(T) = 1$,
which yields the speed

$$
v_N^{\mathrm{CKF}} = \frac{s_b}{T} = \frac{s_b^2}{\ln\left[ (2 N s_b) \left( e^{1/(2UN)} - 1 \right) \right]}. \tag{54}
$$

For small N ($UN \ll 1$) this reduces to the expression (51) valid in the periodic selection
regime, while for large N a finite speed limit $v_\infty = s_b^2/\ln(s_b/U)$ is reached.
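
The two limits of the Crow-Kimura-Felsenstein expression can be confirmed numerically. The following Python sketch is an illustration only and assumes the speed in the form $v = s_b^2/\ln[(2 N s_b)(e^{1/(2UN)} - 1)]$. Since the exponential overflows for small N, the logarithm is evaluated in a rearranged form; the helper `log_expm1` and the switch-over value 30 are choices made here.

```python
import math

def log_expm1(x):
    """ln(exp(x) - 1) for x > 0, evaluated without overflow."""
    if x > 30.0:
        return x + math.log1p(-math.exp(-x))
    return math.log(math.expm1(x))

def ckf_speed(N, U, s_b):
    """Speed s_b**2 / ln[2*N*s_b*(exp(1/(2*U*N)) - 1)]."""
    denom = math.log(2.0 * N * s_b) + log_expm1(1.0 / (2.0 * U * N))
    return s_b ** 2 / denom

U, s_b = 1e-6, 0.02
for N in (1e2, 1e4, 1e6, 1e8, 1e30):
    print(N, ckf_speed(N, U, s_b), 2.0 * s_b**2 * N * U, s_b**2 / math.log(s_b / U))
```

For small N the first two printed speeds should coincide, while for very large N the first column saturates at the value in the last column.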

In writing the relation (54) it is implicitly assumed that the situation at the appearance
of the second mutation is identical to that at the appearance of the first, which is
not true: the second mutation competes against a background consisting of a mixture
of mutant and wild type with mean fitness (relative to the wild type fitness of unity)
$\bar{w} = (1 + s_b) a_t + (1 - a_t) = 1 + a_t s_b < 1 + s_b$. The selective advantage of the second
mutant compared to the background population is therefore larger than $s_b$, and it will
grow faster than the first mutant population. For this reason the expression (54) is
a lower bound on the actual speed. To improve on this bound we need to take into
account the coexistence of several mutant clones in the population, which will be the
subject of the next subsection.

4.2.2 The traveling wave approach 

As the discussion in Sec. 3.3.1 shows, the deterministic evolution of an infinite population
is well described as a travelling wave of approximately Gaussian shape. In
order to extend this approach to large but finite populations, the deterministic dynamics
of the bulk of the wave is combined with a stochastic description of the appearance
of new mutants at the high-fitness edge of the frequency distribution. This idea
was first proposed by Rouzine, Wakeley and Coffin [50] and has since been further
elaborated [9,49,6]. In this section we follow the particularly simple and transparent
derivation presented in [2].

As in Sec. 3.3.1 we denote by $f_t(n)$ the frequency of individuals with n mutations,
and assume for this distribution the Gaussian form

$$
f_t(n) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( - \frac{(n - v_N t/s_b)^2}{2 \sigma^2} \right). \tag{55}
$$

Here we have used that the mean number of mutations acquired up to time t is
$\bar{n} = v_N t/s_b$. The speed $v_N$ and the variance $\sigma^2$ of the travelling wave are related by
Fisher's fundamental theorem or, more generally, by the Guess relation. Neglecting
the direct mutation contribution $U \ln(1 + s_b)$ because of $U \ll 1$ and evaluating the
selection term using the approximation $w_i/\bar{w} \approx 1 + s_b (n_i - \bar{n})$ (where $n_i$ is the number of
mutations acquired by individual i) we see that

$$
v_N \approx s_b^2 \sigma^2, \tag{56}
$$

which is also true for the infinite population case when $U \ll 1$ (compare to (23)).

It is clear that at any finite time t, there is a maximal number of mutations $n_{\max}(t)$
such that $f_t(n) = 0$ for $n > n_{\max}(t)$. Let

$$
L(t) = n_{\max}(t) - \frac{1}{s_b} \ln \bar{w}(t) \tag{57}
$$

denote the lead of this class of fittest individuals relative to the mean population
fitness [9]. Let $t_n$ be the generation when $n_{\max} = n$ for the first time. We assume that
$\langle L(t_n) \rangle \to L_0$ as $t \to \infty$ and $\langle t_{n+1} - t_n \rangle \to \tau$, with constant $L_0$ and $\tau$, which reflects the
existence of a stationary travelling wave with speed

$$
v_N = s_b/\tau, \tag{58}
$$

compare to (54). For times $t_n < t < t_{n+1}$, L(t) then behaves as $L_0 - \ln \bar{w}/s_b \approx L_0 - v_N t/s_b$.
We further assume that, for very large N, the lead satisfies $L(t) s_b \gg 1$, which
implies that the loss by genetic drift of new mutants arising from the most fit class
can be neglected. Analogous to Sect. 4.2.1 we can now compute the accumulated
number of mutants in the most fit class that have existed during the time $t_n < t < t_{n+1}$
according to

$$
N_{\mathrm{acc}}(\tau) = \sum_{t=1}^{\tau} \exp\left( s_b \sum_{u=1}^{t} L(u) \right) \approx \sum_{t=1}^{\tau} e^{L_0 s_b t} = \frac{e^{L_0 s_b \tau} - 1}{1 - e^{-L_0 s_b}} \approx e^{L_0 s_b \tau}, \tag{59}
$$

which is a good approximation if $v_N \tau = s_b \ll L_0 s_b$ or $L_0 \gg 1$.

The mean waiting time until the appearance of a mutant with $n_{\max} + 1$ mutations
is the solution of the equation $N_{\mathrm{acc}}(\tau) U = 1$, which yields

$$
\tau = \frac{1}{L_0 s_b} \ln \frac{1}{U}. \tag{60}
$$

Finally, we close the system of relations by noting that, as long as N is not extremely
large$^{10}$, the new fittest class will most likely appear as a single individual, which
implies that

$$
\frac{1}{\sqrt{2 \pi \sigma^2}}\, e^{-L_0^2/(2 \sigma^2)} = \frac{1}{N} \quad \longrightarrow \quad L_0 s_b = \left( 2 v_N \ln \frac{N s_b}{\sqrt{2 \pi v_N}} \right)^{1/2}. \tag{61}
$$

From Eqs. (56), (60), and (61), $v_N$ becomes the solution of the equation

$$
v_N = \frac{2 s_b^2 \ln\left( N s_b/\sqrt{2 \pi v_N} \right)}{(\ln U)^2}, \tag{62}
$$

which leads to the final result

$$
v_N^{\mathrm{Gauss}} \approx \frac{2 s_b^2 \ln N}{(\ln U)^2} \tag{63}
$$

for very large N. The logarithmic growth of the speed with N must saturate when
the infinite population limit $v_N = s_b$ is approached. According to (63) this happens
when $N \sim N_c \sim e^{(\ln U)^2/(2 s_b)}$, in agreement with the estimate (31). For population sizes
exceeding $N_c$ the relation (61) is no longer valid, because the initial frequency of the
fittest genotype at $t_n$ can be much larger than 1/N once $N \gg N_c$. The existence of
an absolute speed limit $v_N = s_b$ is evident from (58), because $\tau$ cannot be smaller
than one generation time in the discrete time model. For models with overlapping
generations such a restriction does not exist, because a larger number of offspring
can be generated within much less than an average generation time, and the speed
increases proportional to $\ln N$ for arbitrary N.

10 More precisely, $N \ll N_c$.
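
The self-consistent equation for the speed and its asymptotic form can be compared numerically. The following Python sketch is an illustration only. It assumes the implicit relation in the form $v_N = 2 s_b^2 \ln(N s_b/\sqrt{2 \pi v_N})/(\ln U)^2$, the asymptotic result $2 s_b^2 \ln N/(\ln U)^2$ and the crossover scale $\ln N_c = (\ln U)^2/(2 s_b)$. It works with $\ln N$ as input so that very large population sizes can be handled; the function names and the number of iterations are choices made here, and $N s_b$ is assumed large enough for the logarithm to stay positive.

```python
import math

def gauss_speed(ln_N, U, s_b, iterations=100):
    """Solve v = 2 s_b^2 ln(N s_b / sqrt(2 pi v)) / ln(U)^2 by fixed-point iteration."""
    c = 2.0 * s_b ** 2 / math.log(U) ** 2
    v = c * ln_N  # asymptotic form used as the starting guess
    for _ in range(iterations):
        v = c * (ln_N + math.log(s_b) - 0.5 * math.log(2.0 * math.pi * v))
    return v

def gauss_speed_asymptotic(ln_N, U, s_b):
    """Leading behavior 2 s_b^2 ln(N) / ln(U)^2 for very large N."""
    return 2.0 * s_b ** 2 * ln_N / math.log(U) ** 2

def ln_N_c(U, s_b):
    """ln of the population size at which the asymptotic speed reaches s_b."""
    return math.log(U) ** 2 / (2.0 * s_b)

U, s_b = 1e-6, 0.02
for exponent in (8, 10, 20, 100, 300):
    ln_N = exponent * math.log(10.0)
    print(exponent, gauss_speed(ln_N, U, s_b), gauss_speed_asymptotic(ln_N, U, s_b))
print("log10 of N_c:", ln_N_c(U, s_b) / math.log(10.0))
```

The fixed-point iteration converges quickly here because the right-hand side depends on the speed only through a logarithm.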

In this context, it is instructive to compare the discrete and the continuous time
models in different population size regimes. When the population size is small ($NU \ll 1$),
there is a slight difference between these two models. For example, the fixation
probability for small s is Cs with a model dependent constant C (compare (40) and
(41)). Once the population becomes large enough so that the loss of the fittest type
by genetic drift can be neglected, there is no difference between the continuous and
discrete time models. However, for very large $N > N_c$, there is a large difference due
to the restriction $\tau \geq 1$ in the discrete time model.



4.2.3 Comparison to simulations 

The above derivation of the speed of evolution involves a number of rough, uncontrolled
approximations, such that the result (63) can hardly be expected to be quantitatively
accurate. A much more careful analysis along similar lines was presented by
Rouzine, Brunet and Wilke (RBW) [49], who find the implicit expression$^{11}$

$$
\ln N \approx \frac{v_N^{\mathrm{RBW}}}{2 s_b^2} \left[ \ln^2\left( \frac{v_N^{\mathrm{RBW}}}{e U s_b} \right) + 1 \right] - \ln \sqrt{ \frac{s_b^3 U}{v_N^{\mathrm{RBW}} \ln\left( v_N^{\mathrm{RBW}}/(U s_b) \right)} }. \tag{64}
$$

A related approach, which however does not explicitly use the Gaussian shape of the
deterministic part of the travelling wave, was presented by Desai and Fisher [9], who
find$^{12}$

$$
v_N^{\mathrm{DF}} \approx s_b^2\, \frac{2 \ln N + \ln(U s_b)}{\ln^2(U/s_b)}. \tag{65}
$$

In Fig. 5 we compare the different theories with simulation results for the WF model.
For moderately large population size, Eqs. (64) and (65) are of comparable quality, but
for extremely large populations, as shown in the inset of Fig. 5, the predictive power
of Eq. (64) is superior to the other approaches.

In the asymptotic regime, Eq. (64) predicts that $v_N \sim \ln N/\ln^2 \ln N$, but Eqs. (63)
and (65) predict $v_N \sim \ln N$. Rigorous work [64,63] has established that the speed
in the asymptotic regime is not smaller than $O(\ln^{1-\delta} N)$ for any positive $\delta$, which
does not exclude the possibility of a multiplicative $\ln^2 \ln N$-correction. Even with the
dedicated algorithm used to generate the data in the inset of Fig. 5, it seems hardly
possible to settle this issue using numerical simulations.



11 The speed V in Ref. [49] is $v_N/s_b$. To conform to our notation, we slightly modified Eq. (52) in
Ref. [49].

12 A detailed analysis of this approach can be found in [6].

Fig. 5 Comparison of the theoretical expressions Eqs. (65), (63), (64), and (54) with simulations using
$s_b = 0.02$ and $U = 10^{-6}$. In the inset the comparison is extended up to $N = 10^{300}$ except for Eq. (54),
which predicts a speed limit. The algorithm used to obtain these data is described in the Appendix.



4.3 Model II: Continuous distribution of selection coefficients 

In Sect. 4.2 we reviewed theories aimed at calculating the speed of evolution when
the selection coefficient takes a single value (model I). In this subsection, we will
allow the selection coefficient to take a continuous range of values drawn from a
distribution like Eq. (8) (model II). Unlike model I, two mutants arising from the
same progenitor now have different selection coefficients and selection is operative
between these two mutations. In contrast, in model I the competition between two
mutants derived from a single progenitor is purely stochastic, and selection operates
only between clones that have accumulated a different number of mutations.

The qualitative picture of a wave of fixed shape traveling along the fitness axis 
that we developed for model I is expected to apply to model II as well, but it is 
more difficult to quantify, because continuous fitness cannot be reduced to the dis- 
crete number of acquired mutations. Two approaches have so far been proposed to 
deal with this problem. The first is related to an "equivalence principle" discovered
in microbial evolution experiments [26], which suggests that a given distribution of
selection coefficients can be represented by an effective single selection coefficient
along with a suitably rescaled effective mutation rate. A heuristic scheme to implement
this idea was given in [9] and tested against numerical simulations in [19]. As
one might expect, the representation by a single "predominant" selection coefficient
is quantitatively accurate only if the distribution is very narrow, such as $g^{(\beta)}$ with
$\beta = 10$, and it fails completely when $\beta < 1$ [19].

The second approach, first proposed by Gerrish and Lenski (GL) [21], attempts to
extend the periodic selection picture into the clonal interference regime by focusing
on mutations of exceptionally large effect. Clonal interference is seen as a filter that
eliminates mutations whose effect is small enough to be superseded by a mutation
of larger effect arising later in the process. Once the size of selection coefficients
of mutations that survive the competition by other clones has been identified, along
with the rate at which such mutations appear, the speed of evolution is obtained from
a simple relation similar to Eq. (51) used in the periodic selection regime.

Fig. 6 Plots of mean fitness corresponding to the two panels on the right hand side of Fig. 4. The population
sizes are (a) $10^5$ and (b) $10^7$. Although assumptions of the GL approach are not strictly applicable,
one observes regions where the mean fitness behaves in a step-like fashion.

In the following we outline the GL approach, derive its asymptotic predictions, 
and compare it to simulation results. 

4.3.1 Gerrish-Lenski Theory 

The GL-theory is based on two assumptions [21,44]. First, the type of any individual
at any time is either the wild type or a mutant derived directly from the wild
type. The contributions from multiple mutations arising from an extant mutant are
neglected. Since the fixation of a mutation under this assumption becomes a renewal
process [20], we will refer to this assumption as the renewal assumption. Second, the
loss of a beneficial mutation by stochastic sampling error when rare (genetic drift) is
determined solely by its selection coefficient compared to the wild type. Other beneficial
mutations do not play any role in determining the fate of the mutation at early
times. We will refer to this assumption as the assumption of establishment.

The picture underlying these two assumptions is that the adaptive process can
still be decomposed into separate selective sweeps in which a mutation grows in a
fixed background and eventually takes over the population (compare to Fig. 3). A
signature of this kind of dynamics is a step-like increase of the mean fitness. As can
be seen in Fig. 4, this step-like behavior is pronounced for small populations in the
periodic selection regime. However, as the population size increases, the mean fitness
becomes more and more smooth, see Fig. 6, although distinct steps still occur when a
mutation of exceptionally large strength appears. Thus, the GL approach is expected
to be useful in a restricted range of population sizes, which goes slightly beyond
the periodic selection regime. It is similar in spirit to the Crow-Kimura-Felsenstein
approach reviewed in Sect. 4.2.1, which also successfully captures the slowing down
of adaptation near the onset of clonal interference but fails for larger N (compare to
Fig. 5).

To formulate the GL-theory quantitatively, we make use of two functions introduced
previously: the probability distribution g(s) of selection coefficients (like $g^{(\beta)}$
in Eq. (8)), and the probability $\pi(s)$ for the fixation (or, equivalently, the establishment)
of a mutation of strength s. By the assumption of establishment, the distribution
of the mutations that can spread in the population after the initial fluctuations
and are really competing is then given by$^{13}$ $\pi(s) g(s)$.

For a mutation with selection coefficient s to be fixed, it is necessary that no fitter
mutation is established during the time required for the first mutation to fix. The
expected number of established fitter mutations that appear during this time is

$$
\lambda(s) = \frac{N U t_{\mathrm{fix}}(s)}{2} \int_s^{\infty} du\, \pi(u) g(u), \tag{66}
$$

where$^{14}$ $t_{\mathrm{fix}}(s)$ is given in Eq. (47), the factor 1/2 comes from the renewal assumption
[see also discussion below Eq. (48)], and the integral gives the probability that the
selection coefficient of an established mutation is larger than s. Note that the renewal
assumption prohibits a secondary mutation with selection coefficient $s''$ arising in the
offspring of a primary mutation $s'$ with $s' < s$ but $s' + s'' > s$, which would make
Eq. (66) much more complicated. Hence, within the GL approximation the probability
of not encountering any fitter mutation during fixation is $\exp(-\lambda(s))$ and, accordingly,
the fixation probability of a mutation with selection coefficient s becomes

$$
P_{\mathrm{fix}}(s) = \pi(s) g(s) \exp\left( - \frac{N U \ln N}{\ln(1 + s)} \int_s^{\infty} \pi(u) g(u)\, du \right). \tag{67}
$$

In words, for a mutation with selection coefficient s to be fixed, it must first survive
genetic drift (with probability $\pi(s) g(s)$), then should outcompete all other mutations
(with probability $\exp(-\lambda(s))$). Thus, the substitution rate (the number of fixed mutations
per generation) is

$$
k_{\mathrm{eff}} = N U \int_0^{\infty} P_{\mathrm{fix}}(s)\, ds. \tag{68}
$$

To calculate the speed $v_N$, we need the mean selection coefficient of fixed mutations
which is readily obtained as

$$
s_{\mathrm{eff}} = \frac{\int_0^{\infty} s P_{\mathrm{fix}}(s)\, ds}{\int_0^{\infty} P_{\mathrm{fix}}(s)\, ds}. \tag{69}
$$

Along with $k_{\mathrm{eff}}$ this determines the speed according to$^{15}$ [60]

$$
v_N^{\mathrm{GL}} = k_{\mathrm{eff}} \ln(1 + s_{\mathrm{eff}}). \tag{70}
$$
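
The GL prediction is straightforward to evaluate by numerical quadrature. The following Python sketch is an illustration only and not the scheme used for the figures. It assumes the Weibull-type distribution $g^{(\beta)}(s) = (\beta/s_b)(s/s_b)^{\beta-1} \exp(-(s/s_b)^\beta)$, the fixation probability $\pi(s) = 1 - e^{-2s}$, and $t_{\mathrm{fix}}(s) = 2 \ln N/\ln(1+s)$. The uniform grid, the cutoff parameter `tail` and the function name are choices made here; for $\beta < 1$ a finer grid near s = 0 would be needed.

```python
import math

def gl_speed(N, U, s_b, beta, n=20_000, tail=60.0):
    """Return (v, k_eff, s_eff) of the GL theory from trapezoidal quadrature."""
    s_max = s_b * tail ** (1.0 / beta)  # beyond s_max, g(s) is of order exp(-tail)
    ds = s_max / n
    s = [i * ds for i in range(n + 1)]

    def f(x):
        # pi(s) * g(s), the density of established mutations
        if x == 0.0:
            return 0.0
        z = (x / s_b) ** beta
        g = (beta / x) * z * math.exp(-z)
        return -math.expm1(-2.0 * x) * g

    fs = [f(x) for x in s]
    # tail_int[i] approximates the integral of f from s[i] to s_max
    tail_int = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail_int[i] = tail_int[i + 1] + 0.5 * (fs[i] + fs[i + 1]) * ds

    A = N * U * math.log(N)
    p_fix = [0.0] * (n + 1)  # p_fix[0] = 0 because pi(0) = 0
    for i in range(1, n + 1):
        p_fix[i] = fs[i] * math.exp(-A * tail_int[i] / math.log1p(s[i]))

    norm = sum(p_fix) * ds  # p_fix is negligible at both ends of the grid
    first_moment = sum(x * p for x, p in zip(s, p_fix)) * ds
    k_eff = N * U * norm
    s_eff = first_moment / norm
    return k_eff * math.log1p(s_eff), k_eff, s_eff

for N in (10**2, 10**4, 10**6, 10**8):
    print(N, gl_speed(N, U=1e-6, s_b=0.02, beta=1.0))
```

For small N the exponential factor is close to one and the substitution rate reduces to NU times the mean fixation probability, which provides a simple consistency check of the quadrature.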



13 Without the assumption of establishment, the survival probability of a mutation should also depend
on the population structure at the time when this mutation arises.

14 Note that in previous work on the GL approach the expression $t_{\mathrm{fix}} = 2 \ln N/s$ was used irrespective of
the size of s [21,60,44]. We will get back to the consequences of this replacement in Sect. 4.3.2.

15 The reader may wonder why $\ln(1 + s_{\mathrm{eff}})$ on the right hand side of (70) is not replaced by the average
of $\ln(1 + s)$ with respect to $P_{\mathrm{fix}}(s)$. For small $s_{\mathrm{eff}}$ the difference between the two is obviously negligible, but
the same is true when $s_{\mathrm{eff}} \gg 1$, because then $P_{\mathrm{fix}}$ becomes very narrow due to clonal interference. For a
numerical test of (70) see [44].

Fig. 7 Comparison of the GL theory with the simulation results of the WF model for $U = 10^{-6}$ and mean
selection coefficient $\Gamma(1 + 1/\beta) s_b = 0.02$ for (a) $\beta = 1/2$, (b) $\beta = 1$, and (c) $\beta = 2$. Panel (d) shows the
data from (a)-(c) in double logarithmic scales.



In Fig. 7 we compare Eq. (70) to simulations using $g^{(\beta)}$ with three different values
of $\beta$. The integrals in (67), (68) and (69) were evaluated numerically. We see that
the GL approach is remarkably accurate also beyond the periodic selection regime,
as becomes evident by comparing the double-logarithmic graph in Fig. 7(d) to the
corresponding Figure 5 for model I. However, the deviations grow as N increases,
in particular for $\beta = 2$, where the GL-prediction shows a negative curvature in $\ln N$
which is not present in the simulation data. We will return to this point at the end
of the next subsection. As the numerical scheme employed for model I relies on the
discreteness of the fitness space (see Appendix), we have no information about the
behavior of $v_N$ for very large N. Since we have shown in Sect. 3.3.2 that the speed of
evolution is infinite in the infinite population model, we merely know that (in contrast
to model I) $\lim_{N \to \infty} v_N = \infty$.



4.3.2 Asymptotic Behavior of the GL theory 

Although we cannot analytically evaluate the expression for $v_N$ predicted by the GL
approach, an accurate asymptotic approximation can be derived, which is the topic
of this section. Throughout, the distribution (8) of selection coefficients is used. The
calculation follows the idea presented (for $\beta = 1$) in Ref. [44]; see also Ref. [60]. The
only difference is that $t_{\mathrm{fix}} = 2 \ln N/\ln(1 + s)$ is used rather than $2 \ln N/s$, which will
turn out to affect the conclusion significantly.

The integrations involved in the GL theory take the form

$$
I[A; n] = \int_0^{\infty} ds\, s^n f(s) \exp(-A h(s)), \tag{71}
$$

where $f(s) = \pi(s) g^{(\beta)}(s)$, $h(s) = \int_s^{\infty} f(u)\, du/\ln(1 + s)$, and $A = N U \ln N$. Note that
h(s) is a decreasing function with the range $[0, \infty]$. By the change of variable y = h(s),
the above integral becomes

$$
I[A; n] = \int_0^{\infty} \Psi(y) e^{-A y}\, dy, \tag{72}
$$

where

$$
\Psi(h(s)) = \frac{s^n f(s)}{|h'(s)|} = s^n \ln(1 + s) \left( 1 + \frac{h(s)}{(1 + s) f(s)} \right)^{-1}. \tag{73}
$$

To arrive at Eq. (72) we have used $f(s) = -(d/ds)(\ln(1 + s) h(s))$ and the fact that
h(s) is a (monotonic) decreasing function. As $A \to \infty$, $I[A; n]$ is dominated by the
contribution around y = 0, or equivalently around $s = \infty$. When s is very large, we
can approximate $\pi(s) \approx 1$ and hence $h(s) \approx \exp(-(s/s_b)^\beta)/\ln s$. Hence for large s,
we can approximate $s \approx s_b (-\ln y)^{1/\beta}$ ($y \ll 1$), and

$$
\Psi(y) \approx s^n \ln s \left( 1 + \frac{1}{\beta (s/s_b)^{\beta} \ln s} \right)^{-1} \approx s_b^n (-\ln y)^{n/\beta}\, \frac{\ln(\ln(1/y))}{\beta}, \tag{74}
$$

where we have kept only the leading order term. Hence

$$
I[A; n] \approx \frac{s_b^n}{\beta} \int_0 (-\ln y)^{n/\beta} \ln\ln(1/y)\, e^{-A y}\, dy \approx \frac{s_b^n}{A \beta} (\ln A)^{n/\beta} \ln\ln A. \tag{75}
$$

The substitution rate is then

$$
k_{\mathrm{eff}} = N U\, I[N U \ln N; 0] \approx \frac{\ln\ln N}{\beta \ln N}, \tag{76}
$$

which, as $N \to \infty$, approaches 0 for any $\beta$. The asymptotic behavior of the speed is

$$
v_N^{\mathrm{GL}} \approx k_{\mathrm{eff}} \ln(1 + s_{\mathrm{eff}}) \approx \frac{(\ln\ln N)^2}{\beta^2 \ln N}, \tag{77}
$$

which also approaches 0 as $N \to \infty$.

This asymptotic behavior is easily understandable from an extremal statistics argument.
The maximal selection coefficient $s_{\max}$ observed over M mutation events is
given approximately as a solution of

$$
\mathrm{Prob}\{s > s_{\max}\} = \int_{s_{\max}}^{\infty} g(u)\, du = \exp\left( -(s_{\max}/s_b)^{\beta} \right) \approx \frac{1}{M}. \tag{78}
$$

Following the GL hypothesis, the selection coefficient that gets fixed has to be the
maximum of all selection coefficients that appear within its own fixation time, i.e.
one has to consider a typical number

$$
M \sim N U t_{\mathrm{fix}} \sim N U \ln N/\ln(1 + s_{\max}) \tag{79}
$$

of mutations. Thus, the leading behavior of $s_{\max}$ becomes $s_{\max} \approx s_b (\ln N)^{1/\beta}$, and
the effective substitution rate is given by (up to leading order)

$$
k_{\mathrm{eff}} \approx \frac{1}{t_{\mathrm{fix}}(s_{\max})} \sim \frac{\ln\ln N}{\beta \ln N} \tag{80}
$$

as in Eq. (76), and the velocity is the same as in Eq. (77).

Fig. 8 Comparison of the GL theory using $t_{\mathrm{fix}} = 2 \ln N/\ln(1 + s)$ (symbols) with that using $t_{\mathrm{fix}} = 2 \ln N/s$
(line) for $U = 10^{-6}$, $s_b = 0.02$, and $\beta = 1$. Inset: Close up of the boxed area. As anticipated by the asymptotic
analysis, the speed decays, though slowly, with N.

The asymptotic behavior obtained in this section is completely different from
previous reports$^{16}$ [21,60,44]. The reason is clearly the factor $\ln(1 + s)$ in the denominator
of $t_{\mathrm{fix}}(s)$, which is very different from s when $s_{\mathrm{eff}} \gg 1$. However, this effect is
only relevant when N is extremely large. As Fig. 8 shows, the true asymptotic behavior
(77) is approached only when $N > 10^{100}$ for $U = 10^{-6}$ and $s_b = 0.02$ with $\beta = 1$,
and the difference between using $\ln(1 + s)$ and s in $t_{\mathrm{fix}}(s)$ is small when $N < 10^{20}$. So
for realistic values of N, replacing $\ln(1 + s)$ by s can provide a good approximation
for the speed.

In fact, if the mutant fitness is derived from the parental fitness by multiplication with $e^s$ rather than with $1+s$, which would correspond to a continuous time picture, the fixation time is $2 \ln N / s$ for all $s$. The speed is then given by the expression $v_N = k_{\mathrm{eff}} s_{\mathrm{eff}}$ used in Ref. [21], rather than by Eq. (TTOb. The leading asymptotic behavior of GL theory within this scheme can be obtained along the lines of [44] or, more directly, by adapting the extremal statistics argument given above. Since the leading behavior



16 Note that in the original paper of Gerrish and Lenski [21], it was erroneously concluded that $v_N$ approaches a finite "speed limit" for $N \to \infty$.
Fig. 9 Fixation of multiple mutations in a population of size N = 5. At time t, four types are present, and
only the red mutation is fixed (= shared by all individuals). In the next generation, the individuals with 
one and two mutations leave no offspring, and consequently the blue and the green mutation go to fixation 
simultaneously. 

of $s_{\max}$ is the same as before, the asymptotic behavior becomes

$$ k_{\mathrm{eff}} \sim s_{\max}/\ln N \sim s_b \ln^{1/\beta - 1} N, \qquad v_N \sim s_b^2 \ln^{2/\beta - 1} N. \qquad (81) $$

Thus the graph of $v_N$ versus $\ln N$ within GL theory is positively curved when $\beta < 1$ and negatively curved when $\beta > 1$, as is visible in Figs. 7 (a)-(c). The simulation results for $\beta = 2$, however, do not share this feature, and lie distinctly above the GL prediction for large N. In the next subsection we elaborate on this observation.
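The dependence of the scaling (81) on $\beta$ can be tabulated with a few lines of Python. This sketch is an illustration that keeps only the leading powers of $\ln N$ and omits all numerical prefactors; the function names are hypothetical.

```python
def gl_exponents(beta):
    """Exponents of ln N in Eq. (81): (k_eff exponent, v_N exponent)."""
    return 1.0 / beta - 1.0, 2.0 / beta - 1.0


def gl_speed(ln_N, s_b, beta):
    """Leading-order speed s_b**2 * (ln N)**(2/beta - 1), prefactors omitted."""
    return s_b ** 2 * ln_N ** (2.0 / beta - 1.0)


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0, 3.0):  # illustrative values
        k_exp, v_exp = gl_exponents(beta)
        print(f"beta={beta}: k_eff ~ (ln N)^{k_exp:+.2f}, v_N ~ (ln N)^{v_exp:+.2f}")
```

For $\beta = 1$ the speed exponent equals 1, so $v_N$ is linear in $\ln N$; exponents above or below 1 correspond to the two curvatures described in the text.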

4.3.3 Importance of multiple mutations 

It is instructive to compare (81) to the result $v_N \sim s \ln N$ obtained for model I in Sect. 4.2. Evidently, the speed in model I should be minimal among all distributions $g(s)$ with the same mean selection coefficient, which implies that $v_N$ should increase at least as fast as $\ln N$ also for model II^17. However, according to (81), $v_N$ grows more slowly than $\ln N$ when $\beta > 1$, and even decreases with N when $\beta > 2$. Moreover, the rate of substitution decreases with increasing $\ln N$ for $\beta > 1$, although we know that $k \to 1$ in the infinite population limit.

This is not really surprising, as the GL approach takes into account only the mutations of largest effect, ignoring the cumulative effect of multiple mutations of average effect which drive the dynamics in model I. On the basis of (81), one might speculate that the evolutionary process is dominated by large, extremal selection coefficients when $\beta < 1$, and by multiple mutations of typical effect when $\beta > 1$. This could also account for the breakdown of the "predominant mutation" approach for $\beta < 1$ [9,19]. Interestingly, the exponential distribution of selection coefficients, which is most widely used in this context [21,42,60,44], would then turn out to represent the marginal case separating the two regimes.

A quantitative measure of the importance of multiple mutations in the evolutionary dynamics can be obtained by asking how many mutations typically go to fixation in a single fixation event. The way in which the fixation of different mutations can become linked is illustrated in Fig. 9. It was observed numerically in [44] (for model

17 For the purpose of this discussion we ignore the saturation of the speed that occurs at extremely large 
N in model I. 
Fig. 10 Left: Distribution of the number of fixed mutations per fixation event for $\beta = \infty$ with $s_b = 0.02$ and $U = 10^{-6}$ in semi-logarithmic scales. From left to right, the population sizes are $10^3$, $10^4$, $10^5$, $10^6$, $10^7$, and $10^8$. Clean geometric distributions are observed. Right: Mean number of fixed mutations per fixation event ($1/q$) as a function of population size.



II with $\beta = 1$) that the probability $J_n$ of $n$ mutations fixing in a single event is well described by a geometric distribution,

$$ J_n = (1-q)^{n-1} q. \qquad (82) $$

The left panel of Fig. 10 shows that the same relationship holds for model I. The parameter $1/q$ is the mean number of simultaneously fixed mutations, and it increases with N in a logarithmic fashion (right panel of Fig. 10). As expected, multiple mutations are more prevalent for larger $\beta$, but there does not seem to be any qualitative difference in the behaviors for $\beta < 1$ and $\beta > 1$. An analytic understanding of the relation (82) is so far only available for the case without selection, where $1/q$ increases linearly with population size N [58,59]. We note, finally, that the time series of fixation events has interesting statistical properties [20,44], which are however outside the scope of the present article.
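The geometric form (82) suggests a simple way to extract $q$ from a list of observed fixation events: since the mean of the distribution is $1/q$, the inverse of the sample mean estimates $q$. The Python sketch below illustrates this on synthetic data; it is not the analysis behind Fig. 10, and the function names and the value of $q$ are illustrative assumptions.

```python
import random


def geometric_pmf(n, q):
    """J_n = (1 - q)**(n - 1) * q for n = 1, 2, ... (Eq. (82))."""
    return (1.0 - q) ** (n - 1) * q


def sample_geometric(q, rng):
    """Draw n >= 1 from J_n by counting trials up to the first success."""
    n = 1
    while rng.random() >= q:
        n += 1
    return n


def estimate_q(counts):
    """Estimate q as the inverse of the mean number of mutations per event."""
    return len(counts) / sum(counts)


if __name__ == "__main__":
    rng = random.Random(0)
    q_true = 0.25  # illustrative value
    counts = [sample_geometric(q_true, rng) for _ in range(20000)]
    print("estimated q:", estimate_q(counts), "mean per event:", 1.0 / estimate_q(counts))
```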



5 Summary and outlook 

In this paper we have reviewed some aspects of evolutionary dynamics in the arguably simplest setting: a population of fixed size N evolving in a time-independent environment, supplied by independently acting beneficial mutations at a constant rate U. The quantity of primary interest is the speed $v_N$ of logarithmic fitness increase, which is determined by the parameters N and U and by the probability distribution $g(s)$ of mutational effects with typical scale $s_b$.

On a qualitative level, one finds three distinct evolutionary regimes. For small populations, in the sense of (50), beneficial mutations are well separated in time and sweep through the population independently. As a consequence, the speed $v_N$ is proportional to $NU$. For larger populations the clones generated by different mutations interfere and the increase of the speed is only logarithmic in N. Finally, in the limit of infinite populations, the speed saturates to a finite value (for the discrete time WF model). In the last regime the problem can be solved exactly, but, due to a conspiracy of the small parameters $U$ and $s_b$, this description applies only to hyperastronomically large populations (see Eq. (OTTi). Real microbial populations of the kind used in evolution experiments typically operate in the intermediate clonal interference regime, which has been the main focus of the article.

Most work on the finite population problem has considered the case of a single selection coefficient (model I), where fitness is discrete. This offers considerable advantages for both approximate and rigorous analytic studies as well as for numerical simulations, which are able to explore the asymptotic regime where $\ln N$ (and not just N) is large. A summary of the present state of affairs with regard to analytic approximations for the speed is given in Fig. 5. The case of a continuous distribution of selection coefficients (model II) is less well understood. Despite its conceptual shortcomings, the Gerrish-Lenski approximation provides a quantitatively rather satisfactory description of the speed over the experimentally relevant range of population sizes (see Fig. 7), although it fails completely when the infinite population limit is approached. We have argued above that model I should provide a lower bound on the speed of evolution for general distributions of selection coefficients, which implies that the speed increases at least as fast as $\ln N$ also for model II, and possibly faster for distributions $g(s)$ that decay more slowly than exponentially.

The unifying paradigm used throughout the article is the description of the evolutionary process in terms of a traveling wave of constant shape moving along the fitness axis [54,50]. This idea has proved to be successful also in the related but distinct context of competitive evolution, where selection is decoupled from reproduction [46,36]. Competitive evolution models mimic a process of artificial (rather than natural) selection, where the character ("trait") of the types that is being selected is not the reproductive ability (fitness) of individuals. In one variant, individuals are assigned a scalar trait which is handed on to the offspring subject to random mutations. In one round of reproduction, each individual creates the same number of offspring, and subsequently the N with the largest value of the trait are selected for the next round [5]. This model falls into the large class of noisy traveling waves of Fisher-Kolmogorov type [52,43,4], which are much better understood than the problems described in the present article. Apart from accurate analytic approximations to the wave speed, the genealogies of populations can also be addressed, which display an interesting relation to the statistical physics of disordered systems [4]. In contrast, the genealogical properties of the WF model with selection are largely unknown.

Although the models described here are of considerable interest for the interpretation of evolutionary experiments [10,13,26,41,51,54,56], the reader should not be left with the impression that they provide a description that is realistic in all or even most respects. For example, the assumption of a constant supply of beneficial mutations cannot be true at arbitrarily long times, and indeed the rate of fitness increase is generally observed to slow down in experiments [57]. One way to take this effect into account is by modeling the genotype as a sequence with a finite number of sites at which mutations can take place iTJTl .

Another approach, known as Kingman's house-of-cards model [35], retains the infinite number of sites approximation but modifies the basic mutation step such that the mutant fitness $w'$ itself is drawn from a fixed fitness distribution $g(w)$. The probability of choosing a beneficial mutation with $w' > w$ then decreases as the mean fitness grows, and correspondingly the logarithmic fitness increases in a sublinear manner determined by the tail of $g$ [45]. In fact this problem turns out to be simpler than the one discussed in the present article, because the diminishing rate of beneficial mutations ($U \to 0$) drives the system into the periodic selection regime where selective sweeps can be treated as independent.

Kingman's assumption that the fitness of the offspring is uncorrelated with that of 
the parent is hardly more realistic than the assumption of independent fitness effects 
of different mutations which underlies Eq. @). The few examples available so far 
indicate that real fitness landscapes lie between these two extremes [48,55], which
implies that the structure of the type space cannot be ignored. Like the modeling 
of evolutionary dynamics which we have discussed in this article, the mathematical 
characterization of such fitness landscapes offers a host of challenging problems that 
can be fruitfully explored by statistical physicists. 
A Simulating the Wright-Fisher model

This Appendix is devoted to explaining how we simulated model I for population sizes up to $10^{300}$, as displayed in Fig. 5. The algorithm is based on that of [44], which we describe first. As in Sec. 3 we denote the frequency of individuals with fitness $e^{n s_b}$ at generation $t$ by $f_t(n)$. Assume that at time $t$ there are $k+1$ distinct fitness values present in the population, i.e. $f_t(n) = 0$ if $n < n_0$ or $n > n_0 + k$. It is straightforward to see from Eq. (3) that the number $m_i$ of individuals having $n_i = n_0 + i$ ($i = 1, \ldots, k+1$) mutations at generation $t+1$ is determined by the multinomial distribution

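$$ p(m_1, \ldots, m_{k+1}) = N! \prod_{i=1}^{k+1} \frac{p_i^{m_i}}{m_i!}, \qquad (83) $$

where

$$ p_i = f_t(n_i)\,(1-U)\,\frac{e^{n_i s_b}}{\bar{w}(t)} + f_t(n_i - 1)\,U\,\frac{e^{(n_i - 1) s_b}}{\bar{w}(t)}. \qquad (84) $$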
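As an illustration of Eq. (84), the following Python sketch computes the class probabilities from a list of frequencies. It is a minimal sketch under stated assumptions: the function name is hypothetical, classes are indexed by their 0-based offset from $n_0$ (so the list also carries the class $n_0$ itself), and fitness is measured relative to $e^{n_0 s_b}$, a common factor that cancels against the mean fitness.

```python
import math


def class_probabilities(f, U, s_b):
    """Probabilities of Eq. (84) for the classes n_0, ..., n_0 + k + 1.

    f[j] is the frequency of class n_0 + j at generation t (j = 0..k).
    Selection acts first (weights proportional to fitness), then each
    offspring gains one mutation with probability U.
    """
    weights = [fj * math.exp(j * s_b) for j, fj in enumerate(f)]
    w_bar = sum(weights)  # mean fitness in units of exp(n_0 * s_b)
    p = [0.0] * (len(f) + 1)
    for j, w in enumerate(weights):
        selected = w / w_bar
        p[j] += (1.0 - U) * selected   # offspring stays in its parent's class
        p[j + 1] += U * selected       # offspring acquires one more mutation
    return p


if __name__ == "__main__":
    # Illustrative values: three classes, U = 1e-6, s_b = 0.02.
    print(class_probabilities([0.2, 0.5, 0.3], 1e-6, 0.02))
```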

Note that the effect of mutations is already implemented in the above algorithm, which is equivalent to the WF model in Section 2 (first selection then mutation). Since this multinomial distribution can be written as

$$ p(m_1, \ldots, m_{k+1}) = \prod_{i=2}^{k+1} \binom{N_i}{m_i} (1-q_i)^{N_i - m_i} q_i^{m_i}, \qquad (85) $$

where

$$ q_i = \frac{p_i}{\sum_{j=1}^{i} p_j}, \qquad (86) $$

$N_i = N_{i+1} - m_{i+1}$ and $N_{k+1} = N$, the multinomially distributed random numbers can be generated by drawing binomial random numbers $k$ times. To be specific, we first draw $m_{k+1}$ from the distribution

$$ \binom{N}{m_{k+1}} (1-q_{k+1})^{N - m_{k+1}} q_{k+1}^{m_{k+1}}, $$

then the $m_j$ are determined in the order of $j = k, k-1, \ldots, 2$ by the conditional distribution

$$ \binom{N_j}{m_j} (1-q_j)^{N_j - m_j} q_j^{m_j}. $$

Finally, $m_1$ is given by $N_1 = N - \sum_{i=2}^{k+1} m_i$.
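The sequential scheme can be sketched in Python as follows. This is an illustration of the decomposition into binomial draws, not the authors' code: the function names are hypothetical, the list `p` is 0-indexed (so `p[0]` plays the role of $p_1$), and the binomial sampler simply sums Bernoulli trials, which is only practical for small N.

```python
import itertools
import random


def binomial(n, q, rng):
    """Binomial draw as a sum of n Bernoulli trials (small n only)."""
    return sum(1 for _ in range(n) if rng.random() < q)


def multinomial_by_binomials(N, p, rng):
    """Draw (m_1, ..., m_{k+1}) by successive conditional binomial draws.

    p[i] is the probability of class i + 1; the probabilities must sum to 1.
    """
    prefix = list(itertools.accumulate(p))  # prefix[i] = p_1 + ... + p_{i+1}
    m = [0] * len(p)
    remaining = N  # individuals not yet assigned (N_{k+1} = N)
    for i in range(len(p) - 1, 0, -1):  # classes k+1, k, ..., 2
        q = min(1.0, p[i] / prefix[i]) if prefix[i] > 0.0 else 0.0
        m[i] = binomial(remaining, q, rng)
        remaining -= m[i]
    m[0] = remaining  # m_1 is fixed by the constraint sum(m) == N
    return m


if __name__ == "__main__":
    rng = random.Random(0)
    print(multinomial_by_binomials(1000, [0.2, 0.3, 0.5], rng))
```

Computing the denominators as prefix sums, rather than subtracting from 1 step by step, avoids accumulating rounding errors when some $p_i$ are very small.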

Since it is not possible to generate integers as large as $10^{100}$ in present day computers, in our simulations of very large populations we treat the $m_i$ as real numbers. To be specific, we use the following algorithm. If $N_j < 10^9$, we generate binomially distributed integer random variables. If $N_j > 10^9$, we first check whether the mean $\bar{m}_j = N_j q_j$ is larger than a prescribed number $M$, which was set to 100 in our simulations^18. If $\bar{m}_j < M$, we generate Poisson distributed random numbers with mean $\bar{m}_j$. Since $N_j$ is sufficiently large and $q_j < 10^{-7}$, the Poisson distribution accurately approximates the binomial distribution in this situation. On the other hand, if $\bar{m}_j > M$, we invoke the central limit theorem to approximate the binomial distribution by a Gaussian; that is, $m_j = \bar{m}_j + \sqrt{N_j q_j (1 - q_j)}\, \mathcal{N}(0,1)$, where $\mathcal{N}(0,1)$ is a normally distributed random number with mean 0 and variance 1.
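The three-way choice described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the exact binomial draw uses geometric waiting times between successes (exact, but with a cost proportional to the number of successes, so a library sampler would be preferable in practice), and the Poisson draw uses Knuth's multiplication method, which is adequate for means below $M = 100$.

```python
import math
import random


def binomial_exact(n, q, rng):
    """Exact binomial draw via geometric gaps between successes."""
    if q <= 0.0 or n <= 0:
        return 0
    if q >= 1.0:
        return n
    log_fail = math.log1p(-q)
    count, pos = 0, 0
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1]
        pos += int(math.log(u) / log_fail) + 1  # position of next success
        if pos > n:
            return count
        count += 1


def poisson(mean, rng):
    """Poisson draw by Knuth's method (suitable for moderate means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def draw_count(N_j, q_j, rng, M=100.0, exact_limit=1e9):
    """Draw m_j: exact binomial, Poisson, or Gaussian depending on N_j and the mean."""
    if N_j < exact_limit:
        return binomial_exact(int(N_j), q_j, rng)
    mean = N_j * q_j
    if mean < M:
        return poisson(mean, rng)
    # Central limit theorem; the result is a real number, not an integer.
    return mean + math.sqrt(N_j * q_j * (1.0 - q_j)) * rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    print(draw_count(1000, 0.3, rng), draw_count(1e12, 1e-11, rng), draw_count(1e300, 0.5, rng))
```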

Needless to say, the above algorithm is successful up to hyperastronomical population sizes because the fitness space is quantized and the number of possible fitness values at each generation, determined by the lead Lq, increases only as $\sim \ln N$. The direct application of this method to model II is not feasible, because in that case the number of different fitness values is at least $NU$.



18 The results do not depend on this choice.



